Time-varying GEV model

TVGEV(data, date, response, design = NULL,
      loc = ~ 1, scale = ~ 1, shape = ~ 1,
      psi0 = NULL, estim = c("optim", "nloptr", "none"),
      parTrack = FALSE,
      coefLower = if (estim != "nloptr") numeric(0) else c("xi_0" = -0.9),
      coefUpper = if (estim != "nloptr") numeric(0) else c("xi_0" = 2.0),
      trace = 0)

Arguments

data

A data frame containing at least the two required variables with their names given in date and response.

date

Character. Name of the date variable in data.

response

Character. Name of the response variable in data.

design

A call to a function creating a data frame with all the variables required in the formulas loc, scale and shape. While the content of data is mainly used for the creation of the TVGEV object, design will be used in prediction or computation of Return Levels.

loc

A formula linking linearly the GEV location parameter to the columns in the data frame created with design. The Rogers-Wilkinson style of formula is used here (as in lm): only the variables are given, the parameter names are created in relation to the variables names.

scale

A formula for the GEV scale parameter.

shape

A formula for the GEV shape parameter.

psi0

A numeric vector of initial values for the parameters. By default a vector is computed using a linear model.

estim

Character giving the package or function to be used for the likelihood maximisation.

parTrack

Logical. If TRUE the value of the parameter vector is tracked during the optimisation (i.e. at each call of the objective) and all the values are returned as a matrix "psi" in a list named "tracked" which contains as well the value of the objective.

coefLower, coefUpper

Numeric vectors or NULL giving bounds on the parameters. These are used overwrite the "normal" values which are -Inf and Inf. These arguments are only used when estim is "nloptr" and are anyway not taken into account in the inference. So the inference is misleading when some box constraints are active at the optimum, see MLE.TVGEV.

trace

Integer level of verbosity.

Value

An object with class "TVGEV", a list. Among the list elements

call

The call. This will be required.

data, design, loc, scale, shape

Copies of the inputs.

response, date

The name of the response and of the date columns in data.

terms

A list with the terms for the three GEV parameters.

opt

The results of the optimisation, if relevant.

estimate

The numeric vector of estimates for the vector psi.

theta

A matrix with three columns containing the GEV parameters loc, scale and shape.

Details

This kind of model describe independent observations having a Generalized Extreme Value (GEV) distribution depending on time. The three GEV parameters ("location", "scale" and "shape") can depend on a date variable used as a covariate in a linear-model style of dependence. The covariates of the model are functions of the date as provided in breaksX, polynomX, trigonX. No other covariates can be used unless the use of predict or RL will not be possible.

Note

When the response contains NA, the corresponding elements are discarded in the computation of the log-likelihood. However the corresponding rows exist in the matrix theta.

Caution

The call passed in the design formal will be reused in prediction. Passing a merely data frame would make prediction impossible because there would then be no way to guess how the covariates of the model were created from the date.

See also

MLE.TVGEV for some details on the maximum-likelihood estimation.

Author

Yves Deville

Examples


## transform a numeric year into a date
df <- within(TXMax_Dijon, Date <- as.Date(sprintf("%4d-01-01", Year)))
df0 <- subset(df, !is.na(TXMax))

## fit a TVGEV model. Only the location parameter is TV.
t1 <- system.time(
    res1 <- TVGEV(data = df, response = "TXMax", date = "Date",
                  design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
                  loc = ~ t1 + t1_1970))

## The same using "nloptr" optimisation.
t2 <- system.time(
    res2 <- TVGEV(data = df, response = "TXMax", date = "Date",
                  design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
                  loc = ~ t1 + t1_1970,
                  estim = "nloptr",
                  parTrack = TRUE))

## use extRemes::fevd the required variables need to be added to the data frame
## passed as 'data' argument
t0 <- system.time({
   df0.evd <- cbind(df0, breaksX(date = df0$Date, breaks = "1970-01-01",
                    degree = 1));
   res0 <- fevd(x = df0.evd$TXMax, data = df0.evd, loc = ~ t1 + t1_1970)
 })

## compare estimate and negative log-liks
cbind("fevd" = res0$results$par,
      "TVGEV_optim" = res1$estimate,
      "TVGEV_nloptr" = res2$estimate)
#>              fevd TVGEV_optim TVGEV_nloptr
#> mu0   32.06678895 32.06638460  32.06679233
#> mu1   -0.02391857 -0.02392656  -0.02391860
#> mu2    0.07727041  0.07728411   0.07727031
#> scale  1.75585289  1.75541862   1.75585346
#> shape -0.18130928 -0.18112018  -0.18130938
cbind("fevd" = res0$results$value,
      "VGEV_optim" = res1$negLogLik,
      "TVGEV_nloptr" = res2$negLogLik)
#>          fevd VGEV_optim TVGEV_nloptr
#> [1,] 177.2014   177.2014     177.2014

## ====================================================================
## use a loop on plausible break years. The fitted models
## are stored within a list
## ====================================================================

if (FALSE) {

    yearBreaks <- c(1940, 1950, 1955, 1960:2000, 2005, 2010)
    res <- list()

    for (ib in seq_along(yearBreaks)) {
        d <- sprintf("%4d-01-01", yearBreaks[[ib]])
        floc <- as.formula(sprintf("~ t1 + t1_%4d", yearBreaks[[ib]]))
        res[[d]] <- TVGEV(data = df, response = "TXMax", date = "Date",
        design = breaksX(date = Date, breaks = d, degree = 1),
        loc = floc)
    }

    ## [continuing...] ]find the model with maximum likelihood, and plot
    ## something like a profile likelihood for the break date considered
    ## as a new parameter. However, the model is not differentiable w.r.t.
    ## the break! 

    ll <- sapply(res, logLik)
    plot(yearBreaks, ll, type = "o", pch = 21, col = "orangered",
         lwd = 2, bg = "gold", xlab = "break", ylab = "log-lik")
    grid()
    iMax <- which.max(ll)
    abline(v = yearBreaks[iMax])
    abline(h = ll[iMax] - c(0, qchisq(0.95, df = 1) /2),
           col = "SpringGreen3", lwd = 2)

}