This work considers IV regression in the context of re-sampling. Comparatively, the contribution of the development is a structural identification in the IV model, together with a justification of the multiplier bootstrap.
This work considers non-parametric regression with instrumental variables. A general framework is introduced and identification of the target of inference is discussed. Furthermore, the multiplier bootstrap is considered in a general form and justified. Moreover, the procedure is used to test hypotheses on the target function.
Introduce independent, identically distributed observations
(2.1)
from a sample set on a probability space. Let the parameter set be compact, with the random variables taking values in their respective spaces. Assume a system of non-linear equations
(2.2)
A parametric relaxation of the system introduces a non-parametric bias. For an orthonormal functional basis define
(2.3)
such that the expansion holds. The substitution then transforms (2.2) and gives
(2.4)
with a bias
(2.5)
A particular case of (2.4), under the parametric assumption and with a single instrument, is the popular instrumental-variables model ([1], [8]). The system is rewritten as
(2.6)
with the corresponding definition.
The second statement yields the exact form of a solution to (2.6):
(2.7)
Hence the correlation of the instrumental variable with the features identifies the parameter (up to a scaling), which makes the choice of the instrument a crucial task. The empirical relaxation of (2.6) used in the literature (see [1], [8]) closely resembles the following form
(2.8)
for the parameters above, or alternatively (Lemma 2.1) a form corresponding to the latter system up to a notational convention.
The model has been theoretically and numerically investigated in a number of papers (see [1], [8]), and in this article it is used as a numerical benchmark (see the section 'Numerical').
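The linear IV benchmark of the form (2.8) is commonly estimated by two-stage least squares. The following is a minimal sketch; the function name, variable names, and the simulated design are illustrative and not taken from the paper:

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """Classical 2SLS estimate for the linear IV model y = X @ beta + u
    with instrument matrix Z assumed uncorrelated with the error u."""
    # First stage: fitted values X_hat = Z (Z'Z)^{-1} Z' X, computed
    # without forming the n-by-n projection matrix.
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # Second stage: beta = (X_hat' X)^{-1} X_hat' y.
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Simulated check: one endogenous regressor, two instruments.
rng = np.random.default_rng(0)
n = 5000
Z = rng.standard_normal((n, 2))
u = rng.standard_normal(n)
x = Z @ np.array([1.0, 0.5]) + 0.8 * u + rng.standard_normal(n)  # endogenous
y = 2.0 * x + u
beta = two_stage_least_squares(y, x.reshape(-1, 1), Z)
```

With strong instruments, as simulated here, the estimate recovers the structural coefficient; OLS on the same data would be biased by the correlation between x and u.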
The model (2.4) turns into
(2.10)
A solution to (2.10) is an intersection of a sphere and a hyperplane. If it is unique, the hyperplane is a tangent linear subspace to the sphere, and the optimization procedure (2.9) is solved by definition at the intersection point. Conversely, if there exists a solution to the optimization problem, then it is guaranteed to be unique as the solution to a convex problem with linear constraints, and by definition it satisfies (2.4). Redefine
(2.11)
on a probability space. Let the parameter set be compact, the random variables take values in their respective spaces, and let the observations uniquely identify a solution to the system
(2.12)
in the particular case considered above.
Identification in the non-i.i.d. case is complicated by the fact that the number of equations is normally larger than the dimension, leading to possibly different identifiability scenarios. They are distinguished based on the rank of a matrix
(2.13)
Note that the rank, and thus a solution to (2.12), depends on the sample size (the dimension is assumed to be fixed). However, there is no prior knowledge of which sample size corresponds to the identifiable function. Therefore, the discussion requires an agreement on the target of inference.
A way to reconcile uniqueness with the observed dependence is to require the function to be independent of the sample size: the model (2.12) makes sense only if it points consistently at a single function regardless of the number of observations. Accordingly, define a target function.
Assume there exists a sample size such that the rank condition holds; then call a function a target if it solves (2.12).
For smaller samples, a bias between a solution and the target has to be considered. However, in the subsequent text it is implicitly assumed that the sample size is large enough.
Based on the convention (2.3), introduce a classification:
Complete model: there exists a sample size for which the rank is full.
Incomplete model: for no sample size is the rank full.
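The dichotomy can be illustrated numerically with a rank computation. The moment matrix below is a hypothetical stand-in for the matrix in (2.13), not the paper's exact construction:

```python
import numpy as np

# A full-rank sample moment matrix versus one with a redundant column,
# mimicking the complete / incomplete classification by rank.
rng = np.random.default_rng(0)
n, p = 1000, 4
Z = rng.standard_normal((n, p))
M_complete = Z.T @ Z / n                    # generically full rank
M_incomplete = M_complete.copy()
M_incomplete[:, -1] = M_incomplete[:, -2]   # redundant (collinear) column

rank_c = np.linalg.matrix_rank(M_complete)
rank_i = np.linalg.matrix_rank(M_incomplete)
```

In the complete case the matrix can be inverted directly; in the incomplete case the rank deficiency forces the reduced treatment described next.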
Identification in the 'incomplete' model is equivalent to the i.i.d. case up to a notational change in the number of instruments and the corresponding replacement of the equations with instruments by the equations from (2.12). Otherwise, 'completeness' of a model allows for a direct inversion of (2.12). Generally, a complete model is given without the restriction
(2.14)
In this case a natural objective function for inference is the quasi log-likelihood
(2.15)
again with
and
Introduce an empirical relaxation of the biased model (2.4):
(3.1)
with centered errors. Courtesy of Lemma 2.2, a natural objective function is the penalized quasi log-likelihood
(3.2)
with
The maximum likelihood estimator (MLE) and its target are given by
For a fixed projector, introduce a linear hypothesis and define the log-likelihood ratio test
(3.3)
The test weakly converges to a chi-square distribution (Theorem 4.3), and it is convenient to define the corresponding quantile as
It implies that the quantile depends only weakly on the dimension.
For a set of re-sampling multipliers, define the bootstrap conditional on the original data, along with the corresponding bootstrap MLE (bMLE) and its target.
A centered hypothesis and the respective test are defined accordingly
(3.4)
and analogously for the quantile. Theorem 4.4 yields the same convergence in growing dimension.
Under the parametric assumption - the non-parametric bias is zero - the bootstrap log-likelihood test is empirically attainable and its quantile can be computed explicitly. On the other hand, the unattainable quantile calibrates the original test. Between the two there exists a direct correspondence: in the section [LABEL:GCA] it is demonstrated that the bootstrap quantile can be used instead.
Multiplier bootstrap procedure: (3.5)
Sample the multipliers and compute the bootstrap quantile satisfying
Test the hypothesis against the quantile using the inequalities
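The steps above can be sketched as follows. The quadratic score statistic and the standard Gaussian multipliers are illustrative stand-ins for the paper's exact definitions (the weights in the source may, for instance, have mean one):

```python
import numpy as np

def multiplier_bootstrap_quantile(scores, n_boot=2000, alpha=0.05, seed=0):
    """Empirical (1 - alpha)-quantile of a multiplier-bootstrap statistic.

    `scores` holds per-observation score contributions, shape (n, p).
    The bootstrap statistic is the squared norm of the re-weighted,
    conditionally centered score: a likelihood-ratio-type quadratic form.
    """
    rng = np.random.default_rng(seed)
    n, _ = scores.shape
    centered = scores - scores.mean(axis=0)   # center conditionally on the data
    stats = np.empty(n_boot)
    for b in range(n_boot):
        w = rng.standard_normal(n)            # i.i.d. multipliers
        s = centered.T @ w / np.sqrt(n)       # re-weighted score
        stats[b] = s @ s
    return np.quantile(stats, 1 - alpha)
```

For i.i.d. standard-normal scores the returned value should approach the chi-square quantile with p degrees of freedom, consistent with the weak convergence discussed above.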
The idea is numerically validated in the section ’Numerical’. Its theoretical justification follows immediately.
In the most general case, neither does the objective estimate the target consistently, nor is the model (2.1) justified as suitable for an arbitrary function. Moreover, regression with instrumental variables adds a further concern: the chosen instruments can be weakly identified (see Section 7.1), and inference might then involve separate testing for weakness, complicating the original problem.
The finite-sample approach (Spokoiny 2012 [9]) is an option to marry the structure of the objective with the properties of the probability space (2.1), and it automatically accounts for the unknown nature of the instruments in a regression problem.
Finite sample theory: (4.1)
Suppose conditions (4.1) are fulfilled. Define a score vector
then the bound holds, with a universal constant,
at least with the corresponding probability.
A bootstrap analogue of the Wilks expansion also follows; it was claimed in Theorem B.4, Section B.2 of Spokoiny and Zhilova 2015 [11].
Suppose conditions (4.1) are fulfilled. Define a bootstrap score vector
then the bound holds, with a universal constant,
at least with the corresponding probability.
Moreover, the log-likelihood statistic follows the same local approximation in the context of hypothesis testing and satisfies the corresponding expansion (see Appendix, Section 8.5).
Assume conditions (4.1) are satisfied; then, with a universal constant, the bound holds
with the corresponding probability. The score vector is defined respectively
and the Fisher information matrix
A similar statement can be proven in the bootstrap world.
Assume conditions (4.1) are fulfilled; then, with the corresponding probability, the bound holds
with a universal constant, where the score vector is given by
The theorem is effectively the same, as the re-sampling procedure replicates the assumptions on the quasi log-likelihood sufficient for the statement (shown in Section 8.3 of the Appendix).
In view of the re-sampling justification, the small modeling bias condition from Spokoiny and Zhilova 2015 [11] deserves a separate discussion. The condition appears in the general way of proving the validity of the re-sampling procedure. Namely, for a small error term it is claimed that
with the matrices
where the term is assumed to be of the order of the error, essentially meaning that the deterministic bias is small. However, the assumption
appears in the current development only in the form of the condition 'Target' in (4.1). The substitution is possible due to the next lemma.
Assume that the condition 'Target' holds; then the bias term vanishes.
By definition of the target of estimation,
the condition 'Target' implies that the expectation is zero. This means that any particular choice of the term with a fixed index is also zero. Thus the statement follows. ∎
There are two results that constitute the basis for the re-sampling (3.5). The first - a Gaussian comparison - is taken from Götze, Naumov, Spokoiny and Ulyanov [4] and adapted to the needs and notation of this work.
Assume two centered Gaussian vectors; then it holds that
with a universal constant, where the matrix norm is the operator norm.
The second - a Gaussian approximation - is developed in the appendix (Section 8.7).
Introduce the notation for the vectors
such that
the summands are independent and sub-Gaussian.
Then a simplified version of Theorem 8.27 from the appendix holds.
Assume the framework above; then
with a universal constant.
Finally, the critical value and its empirical counterpart are glued together by the matrix concentration inequalities from Section 8.6.
The essence of the re-sampling is to translate the closeness of the distributions into the closeness of the corresponding matrices - with the help of the Wilks expansion (Theorems 4.3, 4.4) and the Gaussian comparison result - and to approximate the unknown quantities by their Gaussian counterparts. It all amounts to the central theorem.
The parametric model (2.4) in the introduction, under the assumption (4.1), enables the bound with dominating probability and universal constants.
Note that the critical value depends on the experimental data at hand and is fixed when the expectation is taken with respect to the data-generating distribution.
The BLR test is calibrated on a model from Andrews, Moreira and Stock [1]. In that paper the authors proposed the conditional likelihood ratio (CLR) test, which is used here as a benchmark. The simulated model reads as
(6.1)
(6.2)
where the errors and design matrices are specified as in Section 1, and the hypothesis
concerns the value of a structural parameter. For this hypothesis, Moreira [8] and later Andrews, Moreira and Stock [1] construct a CLR test based on two vectors
and
with the corresponding notation. The two vectors are independent and together form sufficient statistics for the model (6.1), with only one of them depending on the instruments' identification; hence conditioning on it yields the CLR test. The log-likelihood ratio statistic in (6.1) is represented as (see Moreira 2003 [8])
Additionally, the Lagrange multiplier (LM) and Anderson-Rubin (AR) tests are given by
The latter two are known to perform acceptably except in the weakly identified case.
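As a concrete reference point, here is a minimal sketch of the Anderson-Rubin statistic in its textbook form; the notation is generic and not tied to the paper's definitions:

```python
import numpy as np

def anderson_rubin_stat(y, x, Z, beta0):
    """AR statistic for H0: beta = beta0 in y = x * beta + u with
    instrument matrix Z.  Its null distribution is F(k, n - k) regardless
    of instrument strength, which is why AR stays valid under weak
    identification."""
    n, k = Z.shape
    e = y - x * beta0                               # residuals under the null
    fitted = Z @ np.linalg.solve(Z.T @ Z, Z.T @ e)  # projection of e onto col(Z)
    rss = e @ e - e @ fitted                        # residual sum of squares
    return (e @ fitted / k) / (rss / (n - k))
```

The statistic compares the part of the null residuals explained by the instruments against the unexplained part; under the null the instruments should explain nothing beyond noise.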
First, a correctly specified model is generated for the given sample size with weak instruments. In this case the powers of the competing tests are drawn in Figure 8.1. To be consistent, the test is also compared to the LM and AR tests; the comparison is given in Figure 8.2, and the data for this case are aggregated in Table 1.
Moreover, an important step is to check how robust the test is to a misspecification of the model. Three special examples are simulated.
Experiment (1) can be found in Figures 8.3 and 8.4 and in Table 2. The numerical study of experiment (2), with a misspecified heteroskedastic error, is given in Figure 8.5 and collected in Table 3. The last experiment is shown in Figure 8.6 and in Table 4.
All figures and tables are collected at the end of the work.
In practice one wants to distinguish instruments based on their strength. For clarity of exposition, this section considers a simplified log-likelihood (2.15), identifying a complete model with the Fisher information matrix
Weak instrumental variables introduce an unavoidable lower bound on the estimation error (Lemma 7.1; see the proof in Appendix 8.1).
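In applied work, instrument strength is commonly screened with the first-stage F statistic. A hypothetical sketch (no-intercept variant, simulated data; the rule of thumb cited in the comment is folklore, not a result of this paper):

```python
import numpy as np

def first_stage_F(x, Z):
    """First-stage F statistic for the regression x = Z @ pi + v
    (no intercept): the ratio of explained to residual variance.
    Small values flag weak instruments; a common rule of thumb
    requires F > 10 before trusting standard IV asymptotics."""
    n, k = Z.shape
    fitted = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x)
    resid = x - fitted
    return (fitted @ fitted / k) / (resid @ resid / (n - k))
```

A strong first stage produces a large F; as the instrument coefficient shrinks toward zero, F drops toward its null level and the lower bound on the estimation error becomes binding.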