Time-varying GEV model
A data frame containing at least the two required
variables with their names given in date
and
response
.
Character. Name of the date variable in data
.
Character. Name of the response variable in
data
.
A call to a function creating a data frame with all
the variables required in the formulas loc
, scale
and shape
. While the content of data
is mainly used
for the creation of the TVGEV
object, design
will be
used in prediction or computation of Return Levels.
A formula linking linearly the GEV location parameter
to the columns in the data frame created with design
. The
Rogers-Wilkinson style of formula is used here (as in lm
):
only the variables are given, the parameter names are created in
relation to the variables names.
A formula for the GEV scale parameter.
A formula for the GEV shape parameter.
A numeric vector of initial values for the parameters. By default a vector is computed using a linear model.
Character giving the package or function to be used for the likelihood maximisation.
Logical. If TRUE
the value of the parameter
vector is tracked during the optimisation (i.e. at each call of
the objective) and all the values are returned as a matrix
"psi"
in a list named "tracked"
which contains as
well the value of the objective.
Numeric vectors or NULL
giving
bounds on the parameters. These are used overwrite the "normal"
values which are -Inf
and Inf
. These arguments are
only used when estim
is "nloptr"
and are anyway not
taken into account in the inference. So the inference is
misleading when some box constraints are active at the optimum,
see MLE.TVGEV
.
Integer level of verbosity.
An object with class "TVGEV"
, a list. Among the
list elements
The call. This will be required.
Copies of the inputs.
The name of the response and of the
date columns in data
.
A list with the terms for the three GEV parameters.
The results of the optimisation, if relevant.
The numeric vector of estimates for the vector
psi
.
A matrix with three columns containing the GEV
parameters loc
, scale
and shape
.
This kind of model describe independent observations having
a Generalized Extreme Value (GEV) distribution depending on time.
The three GEV parameters ("location"
, "scale"
and
"shape"
) can depend on a date variable used as a covariate
in a linear-model style of dependence. The covariates of the model
are functions of the date as provided in breaksX
,
polynomX
, trigonX
. No other covariates can be used
unless the use of predict
or RL
will not be possible.
When the response contains NA, the corresponding elements
are discarded in the computation of the log-likelihood. However the
corresponding rows exist in the matrix theta
.
The call passed in the design
formal will
be reused in prediction. Passing a merely data frame would make
prediction impossible because there would then be no way to guess
how the covariates of the model were created from the date.
MLE.TVGEV
for some details on the
maximum-likelihood estimation.
## transform a numeric year into a date
df <- within(TXMax_Dijon, Date <- as.Date(sprintf("%4d-01-01", Year)))
df0 <- subset(df, !is.na(TXMax))
## fit a TVGEV model. Only the location parameter is TV.
t1 <- system.time(
res1 <- TVGEV(data = df, response = "TXMax", date = "Date",
design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
loc = ~ t1 + t1_1970))
## The same using "nloptr" optimisation.
t2 <- system.time(
res2 <- TVGEV(data = df, response = "TXMax", date = "Date",
design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
loc = ~ t1 + t1_1970,
estim = "nloptr",
parTrack = TRUE))
## use extRemes::fevd the required variables need to be added to the data frame
## passed as 'data' argument
t0 <- system.time({
df0.evd <- cbind(df0, breaksX(date = df0$Date, breaks = "1970-01-01",
degree = 1));
res0 <- fevd(x = df0.evd$TXMax, data = df0.evd, loc = ~ t1 + t1_1970)
})
## compare estimate and negative log-liks
cbind("fevd" = res0$results$par,
"TVGEV_optim" = res1$estimate,
"TVGEV_nloptr" = res2$estimate)
#> fevd TVGEV_optim TVGEV_nloptr
#> mu0 32.06678895 32.06638460 32.06679233
#> mu1 -0.02391857 -0.02392656 -0.02391860
#> mu2 0.07727041 0.07728411 0.07727031
#> scale 1.75585289 1.75541862 1.75585346
#> shape -0.18130928 -0.18112018 -0.18130938
cbind("fevd" = res0$results$value,
"VGEV_optim" = res1$negLogLik,
"TVGEV_nloptr" = res2$negLogLik)
#> fevd VGEV_optim TVGEV_nloptr
#> [1,] 177.2014 177.2014 177.2014
## ====================================================================
## use a loop on plausible break years. The fitted models
## are stored within a list
## ====================================================================
if (FALSE) {
yearBreaks <- c(1940, 1950, 1955, 1960:2000, 2005, 2010)
res <- list()
for (ib in seq_along(yearBreaks)) {
d <- sprintf("%4d-01-01", yearBreaks[[ib]])
floc <- as.formula(sprintf("~ t1 + t1_%4d", yearBreaks[[ib]]))
res[[d]] <- TVGEV(data = df, response = "TXMax", date = "Date",
design = breaksX(date = Date, breaks = d, degree = 1),
loc = floc)
}
## [continuing...] ]find the model with maximum likelihood, and plot
## something like a profile likelihood for the break date considered
## as a new parameter. However, the model is not differentiable w.r.t.
## the break!
ll <- sapply(res, logLik)
plot(yearBreaks, ll, type = "o", pch = 21, col = "orangered",
lwd = 2, bg = "gold", xlab = "break", ylab = "log-lik")
grid()
iMax <- which.max(ll)
abline(v = yearBreaks[iMax])
abline(h = ll[iMax] - c(0, qchisq(0.95, df = 1) /2),
col = "SpringGreen3", lwd = 2)
}