# How to specify a model in fHMM?

#### 2021-06-16

To specify a model in fHMM, you only need to define the named list controls and pass it to fit_hmm, i.e.

1. define controls = list(<parameters>) and

2. call fit_hmm(controls) for estimation.

The list controls can contain the following parameters. You can specify none, some, or all of them. Unspecified parameters are set to their default values.
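The two steps above can be sketched as follows. The parameter values are purely illustrative, and the call to fit_hmm is commented out because it requires the fHMM package to be installed and loaded.

```r
# Step 1: define the controls. Parameters that are left out
# fall back to their default values.
controls <- list(
  path   = ".",                 # save results in the working directory
  id     = "sim_hmm_2_states",  # illustrative identifier
  model  = "hmm",
  states = 2,
  sdds   = "t"
)

# Step 2: estimate the model (requires the fHMM package).
# library(fHMM)
# fit_hmm(controls)
```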

## path

A character, setting the path where model results are saved. The default is path = tempdir() (the per-session temporary directory). We recommend setting the path to your current working directory, e.g. path = ".".

## id

A character that identifies a model. The default is id = "test". We recommend setting a meaningful identifier, e.g. id = "dax_hmm_2_states".

## model

This control determines the type of model that you want to estimate and can be one of

• "hmm" (to estimate a hidden Markov model),

• "hhmm" (to estimate a hierarchical hidden Markov model).

The default is model = "hmm".

## states

This control determines the number of states and depends on the control model.

• If model = "hmm", states is an integer. If states = x, a hidden Markov model with x states is estimated.

• If model = "hhmm", states is a numeric vector of length two. If states = c(x,y), a hierarchical hidden Markov model with x coarse-scale and y fine-scale states is estimated.

The default is states = 2, which defines a 2-state hidden Markov model.

## sdds

This control determines the state-dependent distributions and depends on the control model.

• If model = "hmm", sdds is one of the following values.

• If model = "hhmm", sdds is a vector of length two, containing the following values. The first entry defines the state-dependent distribution for the coarse scale, the second entry for the fine scale.

The following state-dependent distributions can be defined:

• "t", the t-distribution,

• "t(x)", the t-distribution with x fixed degrees of freedom (x = Inf yields the normal distribution),

• "gamma", the gamma distribution.

For example, sdds = c("t","t(1)") defines the t-distribution for the coarse scale (degrees of freedom get estimated) and the t-distribution with one degree of freedom for the fine scale.

The default is sdds = "t", which defines state-dependent t-distributions.
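The remark that x = Inf yields the normal distribution can be checked numerically in base R. This small sketch uses only the base R densities dt and dnorm and does not involve fHMM itself. The grid of evaluation points is an arbitrary choice.

```r
# Arbitrary grid of evaluation points.
x <- seq(-3, 3, by = 0.5)

# Density of the t-distribution with infinite degrees of freedom
# compared with the standard normal density.
same_density <- all.equal(dt(x, df = Inf), dnorm(x))
print(same_density)

# With one degree of freedom, as in sdds = "t(1)", the tails are much heavier.
print(dt(3, df = 1) / dnorm(3))
```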

## horizon

This control determines the length of the time horizon and depends on the control model.

• If model = "hmm", horizon is an integer and determines the length of the time horizon for simulated data.

• If model = "hhmm", horizon is a vector of length two. The first entry is numeric and defines the time horizon for the coarse scale for simulated data. The second entry defines the time horizon for the fine scale, for both simulated and empirical data, and can be either numeric or one of

• "w" for weekly fine-scale chunks,

• "m" for monthly fine-scale chunks,

• "q" for quarterly fine-scale chunks,

• "y" for yearly fine-scale chunks.

The default is horizon = 1000, which simulates 1000 data points from a hidden Markov model.
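To show how model, states, sdds and horizon fit together, here is a sketch of a controls list for a simulated hierarchical hidden Markov model. The numbers are illustrative assumptions, not recommendations.

```r
# Controls for a simulated hierarchical hidden Markov model.
controls_hhmm <- list(
  model   = "hhmm",
  states  = c(2, 3),         # 2 coarse-scale and 3 fine-scale states
  sdds    = c("t", "t(1)"),  # coarse scale: t with estimated degrees of freedom,
                             # fine scale: t with one degree of freedom
  horizon = c(100, 30)       # coarse-scale and fine-scale time horizon
)

# For model = "hhmm", all three controls are vectors of length two.
print(lengths(controls_hhmm[c("states", "sdds", "horizon")]))
```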

## data

A list which contains the following controls that specify the processing of data. If data = NA (the default), data gets simulated.

### source

Either a character (if model = "hmm") or a character vector of length 2 (if model = "hhmm"), containing the file names of the empirical data.

• If source = NA or source = c(NA,NA), respectively, data is simulated.

• If source = "x", data “x.csv” in folder “path/data” is modeled by an HMM.

• If source = c("x","y"), data “x.csv” (type determined by cs_type) on the coarse scale and data “y.csv” in folder “path/data” on the fine scale are modeled by an HHMM.

### column

Either a character (if model = "hmm") or a character vector of length 2 (if model = "hhmm"), containing the names of the desired columns of source.

### truncate

A vector of length 2, containing lower and upper date limits (each in format "YYYY-MM-DD") to select a subset of the empirical data (neither, one or both limits can be specified).

The default is truncate = c(NA,NA), i.e. no limits get specified.

### cs_transform

A character, determining the transformation of empirical coarse-scale data in hierarchical hidden Markov models.

The character has to be an expression in x, where x is the corresponding fine-scale data. The expression must reduce x to a single value.

For example,

• cs_transform = "mean(x)" defines the mean of the fine-scale data as the coarse-scale observation,

• cs_transform = "mean(abs(x))" defines the mean of the absolute values of the fine-scale data as the coarse-scale observation,

• cs_transform = "sum(abs(x))" defines the sum of the absolute values of the fine-scale data as the coarse-scale observation,

• cs_transform = "(tail(x,1)-head(x,1))/head(x,1)" defines the relative change between the first and the last fine-scale observation as the coarse-scale observation.

In practice, any transformation that reduces x to a single value can be defined.
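To get a feeling for what the example transformations return, you can evaluate them by hand on a small vector. The sketch below uses eval and parse from base R. This is an assumption made for illustration and not necessarily how fHMM evaluates cs_transform internally. The values in x are made up.

```r
# Hypothetical fine-scale observations belonging to one coarse-scale time point.
x <- c(100, 102, 101, 105)

# The example transformations, given as character strings.
cs_transforms <- c(
  "mean(x)",
  "mean(abs(x))",
  "sum(abs(x))",
  "(tail(x,1)-head(x,1))/head(x,1)"
)

# Evaluate each expression with x bound to the fine-scale data.
# vapply() with numeric(1) fails if an expression does not return a single value.
values <- vapply(
  cs_transforms,
  function(expr) eval(parse(text = expr), list(x = x)),
  numeric(1)
)
print(values)
```

An expression such as "abs(x)" would fail the single-value check here, because it returns one value per fine-scale observation.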

### log_returns

Either a boolean (if model = "hmm") or a boolean vector of length 2 (if model = "hhmm"), determining whether empirical data should be transformed to log-returns.
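Putting the data controls together, a specification for empirical data might look like the following sketch. The file name "dax" and the column name "Close" are hypothetical placeholders for your own data.

```r
# Controls for a 2-state HMM on empirical data.
controls <- list(
  path   = ".",
  id     = "dax_hmm_2_states",
  model  = "hmm",
  states = 2,
  data   = list(
    source      = "dax",                # hypothetical file "dax.csv" in folder "path/data"
    column      = "Close",              # hypothetical column name in that file
    truncate    = c("2000-01-01", NA),  # lower date limit only
    log_returns = TRUE                  # transform the data to log-returns
  )
)
str(controls$data)
```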

## fit

A list which contains the following controls that specify the estimation.

### runs

A numeric value, setting the number of optimization runs.

The default is runs = 100.

### at_true

A boolean, determining whether the optimization is initialized at the true parameter values.

This is only available for simulated data. Setting at_true = TRUE sets runs = 1 and accept = "all".

The default is at_true = FALSE.

### seed

A numeric value, setting a seed for the simulation and the optimization.

By default, no seed is set.

### accept

Either a numeric vector (containing acceptable exit codes of the nlm optimization) or the character "all" (accepting all codes).

The default is accept = c(1,2).

### print.level

Passed on to nlm.

The default is print.level = 0.

### gradtol

Passed on to nlm.

The default is gradtol = 1e-6.

### stepmax

Passed on to nlm.

The default is stepmax = 1.

### steptol

Passed on to nlm.

The default is steptol = 1e-6.

### iterlim

Passed on to nlm.

The default is iterlim = 200.
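To illustrate what these nlm arguments and the exit codes in accept refer to, the sketch below runs nlm from base R on a toy objective with the default values listed above. The objective function is a made-up stand-in for a negative log-likelihood, and the acceptance check is only an illustration of the idea, not the package's own code.

```r
# Toy objective with its minimum at (1, 2).
f <- function(p) sum((p - c(1, 2))^2)

# Run nlm with the default values listed above.
opt <- nlm(
  f, p = c(0, 0),
  print.level = 0,
  gradtol     = 1e-6,
  stepmax     = 1,
  steptol     = 1e-6,
  iterlim     = 200
)

# nlm reports an integer exit code between 1 and 5.
print(opt$code)

# Keep the run only if its exit code is among the acceptable ones.
accept   <- c(1, 2)
accepted <- opt$code %in% accept
print(accepted)
```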

### scale_par

Either a numeric (if model = "hmm") or a numeric vector of length 2 (if model = "hhmm"), scaling the model parameters in a simulation on the coarse and fine scale, respectively.

The default is scale_par = 1, i.e. the model parameters for simulation do not get scaled.

## results

A list which contains the following controls that specify the output.

### overwrite

A boolean, determining whether existing results with the same id may be overwritten. It is set to TRUE if id = "test".

The default is overwrite = FALSE.

### ci_level

A numeric value between 0 and 1, setting the level for the confidence intervals of the model parameters.

The default is ci_level = 0.95.
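As a closing sketch, the nested lists fit and results can be combined with the top-level controls as shown below. All values are illustrative. Since data is not specified, it keeps its default and data gets simulated.

```r
# Controls for a simulated 3-state HMM with custom estimation and output settings.
controls <- list(
  path    = ".",
  id      = "sim_hmm_3_states",  # illustrative identifier
  model   = "hmm",
  states  = 3,
  sdds    = "t",
  horizon = 500,
  fit = list(
    runs   = 50,       # number of optimization runs
    seed   = 1,        # seed for the simulation and the optimization
    accept = c(1, 2)   # acceptable nlm exit codes
  ),
  results = list(
    overwrite = FALSE,  # do not overwrite existing results with the same id
    ci_level  = 0.9     # level for the confidence intervals
  )
)
str(controls)
```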