PAPER TITLE: Four Tests of Independence in Spatio-Temporal Data

AUTHOR 1: Fernando A. López Hernández
DEPARTMENT: Departamento de Métodos Cuantitativos e Informáticos
UNIVERSITY: Politécnica de Cartagena

AUTHOR 2: Mariano Matilla
DEPARTMENT: Departamento de Economía Aplicada
UNIVERSITY: Universidad Nacional de Educación a Distancia

AUTHOR 3: Manuel Ruiz Marín
DEPARTMENT: Departamento de Métodos Cuantitativos e Informáticos
UNIVERSITY: Universidad Politécnica de Cartagena

SUBJECT AREA: Methods of regional analysis

ABSTRACT: This paper extends the range of techniques for testing the hypothesis of complete spatio-temporal randomness in the case of a general type of variable with a regional or spatial breakdown. The tests currently available in the literature are not well suited to most of the series of interest. We generalize three popular tests of spatial dependence (namely, Moran's I, the spatial BDS and the BP tests), to which we add a Lagrange Multiplier test. Furthermore, by means of a Monte Carlo simulation, we show the finite sample behavior of the four tests for linear and non-linear processes. The paper finishes with an empirical application to the annual growth rates of employment in European regions.

KEYWORDS: Independence; Spatiotemporal Data; Nonlinearity

1. Introduction

The issues related to spatiotemporal data modelling occupy a prominent role in current Econometrics, where a huge literature is devoted to this topic (Baltagi et al., 2007, and references therein). In its simplest form, the problem consists of explaining why the final result of a spatio-temporal process may differ from the mere sum of two processes, one that operates on a spatial base and another that operates along the time axis. As McAuliffe and Afifi (1984, p. 17) put it: "Space-time clustering is said to exist if, among those events that are close in time, there are events that are closer in space than would be expected due to chance alone."
Upton and Fingleton (1985, p. 04) restate the same idea: in absence of contagion, the mechanisms which produce clustering in time and clustering in space might be expected to act independently. It is on this basis that we may consider the null hypothesis to represent the absence of the space-time interaction. Cressie (1993, p. 70) adopts a slightly different perspective when he refers to a situation of complete spatio-temporal randomness (cstr in what follows), in the sense of the absence of any structure in time as well as in space. Obviously, the cstr hypothesis implies unconditional independence, whereas the position of Upton and Fingleton (absence of contagion, aoc henceforth) points only to the independence of the two processes.

The aoc hypothesis is the objective of the $I_{st}$ space-time interaction coefficient of Knox (1964), who precisely defines the terms of the problem: "It has three distinct components (...). The first is a concentration in time over the whole area of study. The second is concentration in space over the whole period of study. The third, an interaction between these two, may not be the least obvious but it is probably the most fundamental of the three because it is not dependent upon the choice of the studied limits" (Knox, 1964, p. 9). Mantel (1967) transforms the Knox coefficient into a formal statistical test, which was subsequently generalized by Hubert et al. (1981). The hypothesis of separability, or how to disentangle the temporal from the spatial effects in the covariance matrix of the process, is a very important issue here (Cressie and Huang, 1999, Stein, 2005, and references therein).

Our contribution to this discussion is rather specific and is motivated, mainly, by the following observations. In the first place, there are few well-established tests for analyzing the cstr assumption for a general type of variable; this scarcity may become a problem for applied work.
Secondly, although most of the existing tests are presented as tests of spatiotemporal independence, they are really tests of noncorrelation. This is particularly true for the Knox approach. It is obvious that these tests will not be consistent against alternatives of non-linear dependence with zero autocorrelation. The topic of nonlinear dependence has been treated infrequently in a spatial or spatiotemporal framework (Robinson, 2009, is an exception) and needs much more attention. Thirdly, many of the existing tests rely on a heavy parametric structure and sometimes require strong assumptions. Normality is a common restriction, which is not always adequate. Furthermore, since spatial data lack any natural order, the specification of a spatial weighting matrix has become common practice. However, as Pinkse (2004) indicates, the specification of the weighting matrix forms part of the null hypothesis (cstr or aoc, it does not matter) and the test becomes conditional on this matrix.

The purpose of our work is to present a battery of tests to detect spatiotemporal relationships in a given variable; specifically, we focus on the cstr hypothesis. The paper consists of five sections. In the second, we introduce the techniques that seem most appropriate for our analysis. We extend three well-known statistics for spatial dependence (Moran's I, the spatial BDS and the Brett and Pinkse, BP, tests) to a spatiotemporal context, to which we add a Lagrange Multiplier obtained for this occasion. The third section presents the results of a Monte Carlo experiment, whereas Section four includes an application of the four statistics to the annual growth rate of employment in the European regions by sectors of activity. The paper finishes with a section of conclusions and future perspectives.

2. Tests for Spatio-Temporal Dependence

As said, there exists a growing interest in the modelling of spatiotemporal data, probably because this type of data is now more easily available.
This tendency requires the development of useful techniques to guide the specification search. In this sense, the hypothesis of spatiotemporal independence seems crucial. What is surprising is that few well-established alternatives to deal with this question can be found (the literature on epidemiology, which is very particular, is an exception; see Song and Kulldorff, 2003, for a review). Our intention is to fill this gap by extending to a spatiotemporal context some of the tests that appear to work well in a purely cross-sectional framework. Moran's I is an obvious candidate, to which we have added the spatial version of the BDS, the BP test of Brett and Pinkse and a Lagrange Multiplier. Below, the four tests are formally presented.

2.1. An extension of Moran index. The STMI test

Moran's I (Moran, 1950) is one of the best-known tests for spatial noncorrelation. The expression of the statistic resembles a classical correlation coefficient:

$$I = \frac{R}{S_{00}} \, \frac{\sum_{r=1}^{R}\sum_{s=1}^{R} (y_r - \bar{y})\, w_{rs}\, (y_s - \bar{y})}{\sum_{r=1}^{R} (y_r - \bar{y})^{2}} \qquad (1)$$

where $\bar{y}$ is the sample mean, $w_{rs}$ is the $(r,s)$ element of the weighting matrix $W$, which must be specified beforehand, and $S_{00}$ is a measure of overall connectivity for the geographical system ($S_{00} = \sum_{r=1}^{R}\sum_{s=1}^{R} w_{rs}$). If the matrix is row-standardized, then $R = S_{00}$. The null hypothesis of Moran's I is the absence of correlation between the spatial series $\{y_r;\ r = 1, 2, \ldots, R\}$ and its spatial lag $\{\sum_{s=1}^{R} w_{rs} y_s;\ r = 1, 2, \ldots, R\}$, although the test is sensitive to other misspecification pitfalls (Anselin and Florax, 1995) and is frequently used as an overall specification measure. This test has interesting asymptotic properties (such as local invariance, efficiency, consistency and a normal asymptotic distribution) and, given a reasonable degree of connectedness, behaves well in finite samples.
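As a minimal sketch of expression (1), the following plain-Python function computes Moran's I for a small example. The function names (`morans_i`, `row_standardize`), the four-region chain and the data values are illustrative assumptions, not taken from the paper.

```python
def row_standardize(w):
    """Divide each row of a weight matrix by its row sum (rows of zeros stay zero)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def morans_i(y, w):
    """Moran's I as in expression (1): (R / S00) * (dev' W dev) / (dev' dev)."""
    r = len(y)
    mean = sum(y) / r
    dev = [v - mean for v in y]                      # demeaned series
    s00 = sum(sum(row) for row in w)                 # overall connectivity S_00
    num = sum(dev[i] * w[i][j] * dev[j] for i in range(r) for j in range(r))
    den = sum(d * d for d in dev)
    return (r / s00) * num / den


# Hypothetical example: four regions on a line (1-2-3-4), binary contiguity.
w = row_standardize([[0, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 1, 0, 1],
                     [0, 0, 1, 0]])
trend = [1.0, 2.0, 3.0, 4.0]          # smooth spatial trend
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours always differ
print(morans_i(trend, w), morans_i(alternating, w))
```

With a row-standardized matrix $S_{00} = R$, so the leading factor equals one; the smooth trend gives a positive value and the alternating pattern a negative one.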
As negative aspects, let us mention the assumption of normality, the absence of a well-defined alternative hypothesis, the need to specify a weighting matrix and the lack of robustness against several alternatives such as non-normality, heteroskedasticity or parametric instability (see Cliff and Ord, 1981, Anselin and Rey, 1991, Tiefelsdorf, 2000, Florax and de Graaff, 2004, and Pinkse, 2004, for different contributions to Moran's I).

As stated, the statistic of (1) has good properties in a strictly spatial framework and can be easily generalized to deal with spatiotemporal relationships. As is evident from (1), Moran's I is the quotient between a quadratic form of a demeaned series in the real matrix $W$ (say $Q_R = [y - \bar{y}]'\, W\, [y - \bar{y}]$) and a normalizing factor. Theorem 1 of Kelejian and Prucha (2001) states that this quadratic form follows a central limit theorem such as:

$$\frac{Q_R - \mu_{Q_R}}{\sigma_{Q_R}} \rightarrow N(0, 1) \qquad (2)$$

$\mu_{Q_R}$ and $\sigma^{2}_{Q_R}$ being the mean and variance of $Q_R$.

The result of (2) holds whenever the real-valued random variable, after demeaning, has zero mean and the matrix of the quadratic form has finite row sums (there are some other, less strict, conditions that may be consulted in Kelejian and Prucha, 2001, pp. 6-7). The point that we would like to stress is that it is not strictly necessary to restrict the application of Moran's I to just one cross-section. If we have $T$ consecutive cross-sections with $R$ observations in each of them, stacked in an $RT \times 1$ vector, the following specification of the weighting matrix also satisfies Theorem 1 of Kelejian and Prucha:

$$W^{*}_{TR} = \begin{pmatrix} W_R & 0 & 0 & \cdots & 0 & 0 \\ I_R & W_R & 0 & \cdots & 0 & 0 \\ 0 & I_R & W_R & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & I_R & W_R \end{pmatrix}$$

The spatial weighting matrix, of order $R \times R$, obtained as usual (including the convention of zeros on the main diagonal), appears along the main block diagonal. The block diagonal below the main one contains the temporal weighting matrix, defined as the identity matrix of order $R$, $I_R$.
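The block structure of $W^{*}_{TR}$ can be sketched in a few lines of plain Python. The sketch assumes that the $T$ cross-sections are stacked period by period, so that observation $(t, s)$ sits at position $tR + s$ (zero-based); the function names and the two-region example are hypothetical.

```python
def spatiotemporal_weights(w_r, t):
    """Build the RT x RT matrix with W_R on the block diagonal and I_R just below it.

    Assumes observations are stacked period by period: index = period * R + location.
    """
    r = len(w_r)
    n = r * t
    w_star = [[0.0] * n for _ in range(n)]
    for period in range(t):
        base = period * r
        for i in range(r):
            for j in range(r):
                w_star[base + i][base + j] = float(w_r[i][j])  # spatial block W_R
            if period > 0:
                w_star[base + i][base - r + i] = 1.0            # temporal block I_R
    return w_star


def quadratic_form(y, w):
    """Q = [y - mean]' W [y - mean] for a stacked vector y."""
    mean = sum(y) / len(y)
    dev = [v - mean for v in y]
    return sum(dev[i] * w[i][j] * dev[j]
               for i in range(len(y)) for j in range(len(y)))


# Hypothetical example: R = 2 regions that are neighbours, T = 3 periods.
w_r = [[0, 1],
       [1, 0]]
w_star = spatiotemporal_weights(w_r, 3)
for row in w_star:
    print(row)
```

Extending the temporal dependence to further lags, as the text goes on to mention, would amount to filling additional block diagonals in the same loop.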
Obviously, the structure of temporal dependence can be extended downwards or upwards, in the same way, if necessary. Let us mention that a similar flexibility applies to the spatial weighting matrix. This flexible framework allows us, for example, to consider spatio-temporal surroundings of location $(t,s)$ of the following form: $(t, s'), (t+1, s), (t+2, s), \ldots, (t+m-1, s)$, where $s$ is any spatial location, $s'$ a spatial neighbour of $s$, and $t$ time.

Under the null hypothesis of noncorrelation, the moments of the quadratic form are

$$\mu_{Q_{TR}} = 0 \quad \text{and} \quad \sigma^{2}_{Q_{TR}} = \mathrm{tr}\left(A^{*\prime}_{TR} A^{*}_{TR}\right) \qquad (3)$$

with $A^{*}_{TR} = \tfrac{1}{2}\left(W^{*}_{TR} + W^{*\prime}_{TR}\right)$ (more details in Kelejian and Prucha, 2001, pp. 7, expression 3.). Given a spatio-temporal process $\{y_{ts}\}_{t \in Z,\, s \in S}$, where $Z$ and $S$ are sets of time and spatial coordinates, with cardinality $|Z| = T$ and $|S| = R$, respectively, the STMI statistic is

$$STMI = \frac{RT}{S_{00}} \, \frac{\sum_{(t,s)}\sum_{(r,k)} (y_{ts} - \bar{y})\, w^{*}_{(t-1)T+s,\,(r-1)T+k}\, (y_{rk} - \bar{y})}{\sum_{(t,s)} (y_{ts} - \bar{y})^{2}} \qquad (4)$$

where $\bar{y}$ is the sample mean, $w^{*}_{u,v}$ is the $(u,v)$ element of the weighting matrix $W^{*}_{TR}$ and $S_{00} = \sum_{u,v} w^{*}_{u,v}$.

2.2. An extension of Brett and Pinkse test. The STBP test

In this section, we extend the BP test of Brett and Pinkse (1997) to the case of spatiotemporal data. A test is proposed for spatiotemporal independence against spatiotemporal dependence of a fixed order, that is, a test for the spatiotemporal independence of individual observations from a finite number of other observations (hereafter referred to as proximate observations).

As before, let $\{y_{ts}\}_{t \in Z,\, s \in S}$ denote the sequence of interest. The $y_{ts}$ can have continuous, discrete or mixed distributions, and the distribution functions may be unknown. Under the null hypothesis, the sequence is stationary and independent in space and time. For the STBP test to be consistent, the following two conditions are sufficient: (1) spatiotemporal dependence of a fixed order, and (2) the sequence has to be strongly mixing. Strong mixing is a weak dependence condition.
Intuitively, it states that dependence vanishes as the elements in the sequence become infinitely far apart (for more information on strong mixing in time series and spatial contexts see Rosenblatt, 1956, Ibragimov and Linnik, 1971, and Anselin, 1988). The fixed order assumption means that if there is dependence under the alternative, then the test will reject the null hypothesis asymptotically; however, when the dependence involves observations that are not geographically proximate, the behavior of the test is undetermined.

The BP test of Brett and Pinkse (1997) is based on the property that two distributions are identical if their characteristic functions are the same. That is to say, two variables are independent if the joint characteristic function factorizes into the product of the marginal characteristic functions.

Following Brett and Pinkse (1997), we obtain the STBP test as follows. Let $g$ be any practitioner-chosen density function with infinite support, and denote by $h(x) = \int e^{iux} g(u)\, du$ the Fourier transform of $g$. For a location $s$ we denote by $N_s$ the set of neighbors of coordinate $s$. Now fix a positive integer $m$. Define

$$N^{m}_{ts} = \{(t, s'), (t+1, s), (t+2, s), \ldots, (t+m-1, s);\ s' \in N_s\}$$

as the set of proximate observations to location $s$ in period $t$, and $n_N = |N^{m}_{ts}|$ its cardinality. Let $y^{N}_{ts} = n_N^{-1} \sum_{(r,k) \in N^{m}_{ts}} y_{rk}$ be the sample average of the proximate observations to $y_{ts}$. The null hypothesis of the STBP test is

$$H_0: y_{ts} \text{ and } y^{N}_{ts} \text{ are independent for all } t \in Z \text{ and } s \in S. \qquad (5)$$

Define $h_{(t,s),(r,k)} = h(y_{ts} - y_{rk})$ and $h^{N}_{(t,s),(r,k)} = h(y^{N}_{ts} - y^{N}_{rk})$. Introduce $\eta_{n1}$, $\eta_{n2}$ and $\eta_{n3}$ as

$$\eta_{n1} = n^{-2} \sum_{(t,s),(r,k)} h_{(t,s),(r,k)}\, h^{N}_{(t,s),(r,k)}; \quad \eta_{n2} = n^{-3} \sum_{(t,s),(r,k),(u,v)} h_{(t,s),(r,k)}\, h^{N}_{(t,s),(u,v)}; \quad \eta_{n3} = n^{-4} \sum_{(t,s),(r,k),(u,v),(p,q)} h_{(t,s),(r,k)}\, h^{N}_{(u,v),(p,q)},$$

where $n = RT$ is the number of observations. Let $\eta_n = (\eta_{n1} - \eta_{n2}) + (\eta_{n3} - \eta_{n2})$, and let $\nu_n$ be the associated normalizing term, which depends on $\gamma_n$, $\mu_n$, the cardinality of the sets of proximate observations and an indicator function $G(\cdot)$ (its full expression is not legible in this transcript), where $\mu_n = n^{-2} \sum_{t,u} h_{tu}$ and $\gamma_n = n^{-3} \sum_{s,t,u} h_{st} h_{su}$.
Under the null of independence, the extension of the Brett and Pinkse statistic is:

$$STBP = \frac{n\, \eta_n}{\nu_n}$$

which is asymptotically $\chi^{2}_{1}$ distributed.

Few assumptions are required for the STBP test: stationarity and fixed order dependence are among the strongest. However, in a spatial setting, the test is not free from assuming some a priori knowledge about the spatial structure of dependencies. Under these conditions, the test is consistent against all departures from the null hypothesis of independence (linear or nonlinear). On the negative side, the Brett and Pinkse test tends to reject the null unduly in the case of nonstationary series; this is a cause for concern in view of the frequent heterogeneity found in spatiotemporal data. Also, the test appears to be rather sensitive to the scaling of the observations as well as to the selection of the bandwidth corresponding to the parameter $m$.

2.3. An extension of Brock, Dechert and Scheinkman test. The STBDS test

Grassberger and Procaccia (1983) introduced the notion of correlation integral: (1) for a given scalar sequence $\{u_1, u_2, \ldots, u_T\}$ and for established values of $m$ and $\tau$, obtain the set of $m$-histories: $u^{\tau}_{t}(m) = (u_t, u_{t+\tau}, \ldots, u_{t+(m-1)\tau})$; (2) compute the correlation integral:

$$C^{\tau}_{m,p}(\varepsilon) = \frac{2}{p(p-1)} \sum_{t=1}^{p-1} \sum_{k=t+1}^{p} G_{\varepsilon}\left(u^{\tau}_{t}(m),\, u^{\tau}_{k}(m)\right) \qquad (6)$$

where $p = T - (m-1)\tau$ is the number of $m$-histories with delay time $\tau$ that can be formed from $T$ observations, and $G_{\varepsilon}$ is the indicator function, so that $G_{\varepsilon}(u^{\tau}_{t}(m), u^{\tau}_{k}(m))$ takes the value 1 if both vectors are within distance $\varepsilon$ of each other, and 0 otherwise. Thus, Grassberger and Procaccia (1983) suggested that one can distinguish nonlinear deterministic data from i.i.d. random data by calculating $C^{\tau}_{m,p}(\varepsilon)$ for different $p$ and testing whether the sequence evolves with $p$ or converges to some fixed limit. This approach requires large quantities of data, and no distribution theory is available for $C^{\tau}_{m,p}(\varepsilon)$.
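The two steps above (form the $m$-histories, then count close pairs) can be sketched as follows. This is an illustrative plain-Python version under stated assumptions: the max norm is used for the distance, "within distance $\varepsilon$" is read as a strict inequality, and the function names and the toy sequence are hypothetical.

```python
def m_histories(u, m, tau=1):
    """Return the p = T - (m - 1) * tau histories (u_t, u_{t+tau}, ..., u_{t+(m-1)tau})."""
    p = len(u) - (m - 1) * tau
    return [[u[t + j * tau] for j in range(m)] for t in range(p)]


def within(a, b, eps):
    """Indicator G_eps: True when the max-norm distance between a and b is below eps."""
    return max(abs(x - y) for x, y in zip(a, b)) < eps


def correlation_integral(u, m, eps, tau=1):
    """Share of pairs of m-histories that lie within eps of each other."""
    hist = m_histories(u, m, tau)
    p = len(hist)
    close = sum(within(hist[t], hist[k], eps)
                for t in range(p - 1) for k in range(t + 1, p))
    return 2.0 * close / (p * (p - 1))


# Toy sequence: two tight clusters far apart.
u = [0.0, 0.5, 5.0, 5.2]
print(correlation_integral(u, m=1, eps=1.0))  # 2 close pairs out of 6
print(correlation_integral(u, m=2, eps=1.0))  # no pair of 2-histories is close
```

The double loop is quadratic in $p$, which is fine for illustration but would need a faster neighbour search for long series.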
The idea behind BDS-type tests is to look for important deviations in the behavior of the correlation integral (6) from what is to be expected under i.i.d. As in Brock et al. (1996), distances are measured with the max norm, $\|x\| = \max_{1 \le k \le m} |x_k|$. In particular, if the data under consideration are i.i.d., then Matilla-García et al. (2004) showed that

$$\lim_{p \to \infty} C^{\tau}_{m,p}(\varepsilon) = \lim_{p \to \infty} \left(C^{\tau}_{1,p}(\varepsilon)\right)^{m}$$

almost surely for all $\varepsilon > 0$ and $m = 2, 3, 4, \ldots$ In general, the statistic

$$BDS^{\tau}_{m,p}(\varepsilon) = \sqrt{p}\; \frac{C^{\tau}_{m,p}(\varepsilon) - \left(C^{\tau}_{1,p}(\varepsilon)\right)^{m}}{\hat{\sigma}^{\tau}_{m,p}(\varepsilon)} \qquad (7)$$

has a limiting standard Normal distribution under the null of i.i.d., where $\hat{\sigma}^{\tau}_{m,p}(\varepsilon)$ refers to a consistent estimate of the asymptotic standard error of $C^{\tau}_{m,p}(\varepsilon) - (C^{\tau}_{1,p}(\varepsilon))^{m}$. The classical BDS statistic of Brock et al. (1996) can be easily obtained from (7) by fixing $\tau = 1$. For simplicity, in this paper, we choose the same value.

Given the good properties of the BDS in a time series context, de Graaff et al. (2001) adapted this test to a spatial framework, for $\tau = 1$. The spatial counterpart of the concept of $m$-history can be labeled $m$-surrounding, and the particular ordering can be based on the concept of contiguity or distance. As a consequence of the difference in the number of possible drawings of $m$-surroundings, the $p = T - m + 1$ in (7) should be replaced by $R$ (the number of spatial coordinates) in the spatial case. The spatial version of the BDS test (SBDS from now on) is also a portmanteau test whose null hypothesis is that the series proceeds from an i.i.d. data generating process. Under the null of independence, the statistic:

$$SBDS_{m,R}(\varepsilon) = \sqrt{R}\; \frac{C_{m,R}(\varepsilon) - \left(C_{1,R}(\varepsilon)\right)^{m}}{\hat{\sigma}_{m,R}(\varepsilon)} \qquad (8)$$

is asymptotically $N(0, 1)$ distributed.

In a spatiotemporal context, the dependencies within and between the spatial and temporal domains tend to be more complex, as interaction can occur over time and over space. Furthermore, strong nonlinearities can arise as a result of the two sources of interaction.
In order to detect any possible deviation from pure independence, either of temporal or spatial origin, we develop a spatiotemporal version of the standard BDS test. Let, as before, $\{y_{ts}\}_{t \in Z,\, s \in S}$ be a spatiotemporal process. The extension of the concept of $m$-history, or $m$-surrounding, to a spatio-temporal framework is implemented as in the case of the STBP test. For each location $s$ let $N_s = \{s_1, s_2, \ldots, s_{m_2 - 1}\}$ be the set of $m_2 - 1$ nearest neighbors of $s$. Now, given a coordinate $s$ in period $t$, we define

$$N^{m_1}_{ts} = \{(t, s'), (t+1, s), (t+2, s), \ldots, (t+m_1, s);\ s' \in N_s\}$$

as the set of proximate observations to location $s$ in period $t$, which generates an $(m_1, m_2)$-spatio-temporal surrounding of $y_{ts}$ of the form:

$$y_{ts}(m_1, m_2) = \left(y_{ts},\, y_{ts_1}, \ldots, y_{ts_{m_2-1}},\, y_{t+1\,s},\, y_{t+2\,s}, \ldots, y_{t+m_1\,s}\right) \qquad (9)$$

Then, we have the following spatiotemporal version of the BDS statistic

$$STBDS_{(m_1,m_2),RT}(\varepsilon) = \sqrt{RT}\; \frac{C_{(m_1,m_2),RT}(\varepsilon) - \left(C_{1,RT}(\varepsilon)\right)^{m_1+m_2}}{\hat{\sigma}_{(m_1,m_2),RT}(\varepsilon)} \qquad (10)$$

where the variance is estimated as follows

$$\hat{\sigma}^{2}_{(m_1,m_2),RT} = 4\left\{ k^{m} + 2\sum_{j=1}^{m-1} k^{m-j} c^{2j} + (m-1)^{2} c^{2m} - m^{2} k\, c^{2m-2} \right\}$$

where $m = m_1 + m_2$, $c = C_{1,RT}(\varepsilon)$ and $k$ is

$$k = \frac{2}{RT(RT-1)(RT-2)} \sum_{r=1}^{RT} \sum_{s=r+1}^{RT} \sum_{t=s+1}^{RT} \left[ G_{\varepsilon}(y_r, y_s)\, G_{\varepsilon}(y_s, y_t) + G_{\varepsilon}(y_r, y_t)\, G_{\varepsilon}(y_t, y_s) + G_{\varepsilon}(y_s, y_r)\, G_{\varepsilon}(y_r, y_t) \right]$$

Notice that the rejection of the null means that the data are compatible with a temporal and/or spatial dependence structure, but the test itself will not be able to identify the nature of the dependencies. Again, the STBDS test can be used as a general test for model misspecification, i.e., as a portmanteau test.

As occurs with the standard BDS and the SBDS, the behavior of the STBDS statistic will crucially depend not only on the free choice parameters ($m$-type and $\varepsilon$-type), but also on the sample size. The Monte Carlo experiments conducted by Brock et al. (1991) suggest that $\varepsilon$ should be between 1/2 and 2 standard deviations of the data, since then the power of the BDS test is maximized.
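To make the construction of the spatio-temporal surroundings concrete, the sketch below builds vectors of the form (9) from a small panel and evaluates the unscaled numerator of (10), that is, $C_{(m_1,m_2),RT}(\varepsilon) - (C_{1,RT}(\varepsilon))^{m_1+m_2}$. Several choices here are assumptions for illustration only: the panel is stored as `y[t][s]`, the neighbour lists are supplied by the user, only periods with $m_1$ later observations available are used, $C_1$ is computed on the first element of each surrounding, and the variance estimate is left out.

```python
def surroundings(y, neighbors, m1):
    """Build (y_ts, spatial neighbours at t, y_{t+1,s}, ..., y_{t+m1,s}) for each (t, s).

    y[t][s] is the panel; neighbors[s] lists the nearest neighbours of location s.
    Only periods t with t + m1 inside the sample are used (an assumption).
    """
    out = []
    for t in range(len(y) - m1):
        for s in range(len(y[0])):
            spatial = [y[t][k] for k in neighbors[s]]
            temporal = [y[t + j][s] for j in range(1, m1 + 1)]
            out.append([y[t][s]] + spatial + temporal)
    return out


def corr_integral(vectors, eps):
    """Share of pairs of vectors whose max-norm distance is below eps."""
    n = len(vectors)
    close = sum(
        max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < eps
        for i in range(n - 1) for j in range(i + 1, n)
    )
    return 2.0 * close / (n * (n - 1))


def stbds_numerator(y, neighbors, m1, eps):
    """C_m(eps) - C_1(eps)^m with m the length of each surrounding (no scaling)."""
    surr = surroundings(y, neighbors, m1)
    m = len(surr[0])
    scalars = [[v[0]] for v in surr]   # one-dimensional counterpart for C_1
    return corr_integral(surr, eps) - corr_integral(scalars, eps) ** m


# Hypothetical panel: T = 3 periods, R = 2 regions, each the other's neighbour.
panel = [[0.0, 1.0],
         [10.0, 11.0],
         [20.0, 21.0]]
neigh = [[1], [0]]
print(surroundings(panel, neigh, m1=1))
print(stbds_numerator(panel, neigh, m1=1, eps=2.0))
```

Under independence the printed difference should be close to zero in large samples; the full statistic would divide it by the estimated standard error and scale by $\sqrt{RT}$.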
Brock et al. (1991) also show that the sample size should be bigger than $200m$, with $m$ no greater than 5, in order to assure a good approximation.

2.4. A Lagrange Multiplier: the STLM test

We complete the discussion by developing a standard Lagrange Multiplier for testing the hypothesis of spatiotemporal independence. The approach is as usual: we need to specify a hypothetical data generating process, write the likelihood function and obtain the corresponding Multiplier. According to this scheme, the model that we are going to use is the
