Espace géographique
Belin

I.S.B.N.2701137306
96 pages

p. 61 to 68
doi: pending


Analyse spatiale

volume 33, 2004/1


An Analytical Description of Spatial Patterns

Arthur Getis, San Diego State University, Department of Geography, CA, USA
Jean Paelinck, George Mason University, School of Public Policy, VA, USA

The description of spatial patterns has always attracted the attention of spatial scientists, economists as well as geographers; this study takes up the problem once more but in a purely abstract fashion. On a reference area of a fractal nature one defines successively measures of concentration (or dispersion), of eccentricity, of randomness, and of clustering; a visual presentation of the global results obtained can be realized through a geophenogram. Finally, two concrete examples are presented, one for five US States, the other for the twelve Dutch provinces; they show that the abstract framework developed can in fact be adapted in a flexible way to the concrete situations encountered.

Keywords: CONCENTRATION, ECCENTRICITY, GEOPHENOGRAM, PATTERN, RANDOMNESS.
 
1. Introduction
 
 
More than ever, spatial patterns are at the center of attention of geographers, economists, and regional scientists. An obvious example is the current concern for the spatial patterns of economic well-being, both within and among nations. Historically, well-known constructions of idealised spatial economic landscapes were created by Christaller (1933), Lösch (1954), and Tinbergen-Bos (Paelinck, 1999). In this paper, spatial patterns will be taken up in that spirit, but in a purely abstract way, without reference to the processes that could have produced them.
Over the years, a number of scholars have sought to differentiate one pattern from another by deriving or describing various measures of shape, form, density, intensity, clustering, centrality, and dispersion (see the recent review by Wentz, 2000). The literature tends to emphasise a variety of subjects; among them are the patterns of towns, the clustering of diseases, the form of physical features, and the shape of economic regions. There are those, however, who have developed universal, systematic ways to measure two-dimensional patterns without reference to such subject areas; these include: Mandelbrot (1983) and fractals; Ahuja and Schachter (1983) and pattern models; Okabe, Boots, and Sugihara (1992) and tessellations; Getis and Boots (1978) and points, lines, and areas; Wentz (2000) and shape; Haggett, Cliff, and Frey (1977) and scale; Tobler (1979) and map projections; and Neft (1966) and measures of centrality. Our work follows in this tradition.
In this paper, we first introduce the nature of the reference area and its statistical distribution (Section 2) and then present a series of descriptive devices: measures of concentration or dispersion (Section 3), eccentricity (Section 4), randomness (Section 5), geophenograms (Section 6), and clustering (Section 7). We consider a spatial pattern as a multidimensional concept where each dimension requires a specific measuring rod. Our goal is to first present the theory for spatial patterns, in general, giving examples along the way, and then, in a subsequent paper, develop a focused approach in order to demonstrate how spatial pattern may be described within subsets of large data sets. The paper will close with a simple, more comprehensive example (Section 8) and some observations on the practical usefulness of the pattern measures (Section 9).
 
2. Reference area and statistical distribution
 
 
The reference area is the one introduced in Kuiper, Paelinck and Rosing (1990); in actuality, the shape of the area is made up of elementary squares, each of size one by one, around a central square. Figure 1 is a depiction of the reference area when the «radius»—the largest distance from the central square—equals 1; the general formula for the number of elementary squares, v, is a function of the radius r:
(1) v = 2r (r+1) +1
Fig. 1
Reference area of radius 1
This particular reference area was chosen because its principle, adding elementary squares, can be generalized to irregular shapes for the study of observed areas (see e.g. Kuiper, 1997). In addition, in the present context, the reference area offers the property of having a central square, c, which allows us to emphasise our particular spatial point of view; it has an odd number of elementary squares, and could, in later investigations, be complemented by an «even» type.
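Equation (1) can be checked by listing the elementary squares directly. The sketch below assumes that the «radius» is counted in steps between squares sharing a side (city-block distance), so that the area consists of all squares whose centres (x, y) satisfy |x| + |y| ≤ r; the function names are illustrative and not taken from the cited sources.

```python
def reference_area(r):
    """Centres of the elementary squares: all (x, y) with |x| + |y| <= r."""
    return [(x, y)
            for x in range(-r, r + 1)
            for y in range(-r, r + 1)
            if abs(x) + abs(y) <= r]


def cell_count(r):
    """Number of elementary squares according to equation (1)."""
    return 2 * r * (r + 1) + 1


# The enumeration and the closed formula agree for small radii.
for r in range(5):
    assert len(reference_area(r)) == cell_count(r)
    print(r, cell_count(r))
```

For r = 1 this gives the five squares of Figure 1; for r = 2 it gives 13 squares.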
The following assumptions are introduced:
A1: spatial contiguity is defined as the rook’s case, which implies that the distance separating elementary squares having a side in common is one;
A2: n elementary units (or objects) can be freely dispersed among the v elementary squares, in the sense that all patterns to be described further down are equiprobable and independent.
In order to fix ideas, it will be helpful to envisage a number of units of one kind (firms, residences, trees, crimes, infected persons, etc.) placed, according to A2, in the v squares (districts, residential blocks, pixels, etc.). The model is simple, but it is able to generate a large number of different spatial patterns.
A partition is described by the number of units in each of the squares. For each partition, there are one or more different arrangements of units in squares. Each different arrangement is called a pattern. The following shows the different types of patterns when n = 5 and v = 5; there are 27 different types in seven different partitions. We shall first use the convention of representing the contents of the cells, a, b, c, d, e, of Figure 1 as:
[Figure: layout of the cells a, b, c, d, e of the reference area]
The total number of spatial patterns is a function of the number of different ways one can divide n objects among v squares for each partition, keeping in mind that the number of patterns for each partition is a function of the distance of each square from the central square. For example, the central square can contain objects (X) or it can be empty (0). The four squares one distance unit away from the central square (peripheral squares when r = 1) are related either by being diagonally next to each other or by being separated by the central square. Thus, when r = 1 there are four general types of spatial patterns; these are shown as:
[Figure: the four general types of spatial patterns for r = 1]
Thus, in order to identify the total number of patterns for each partition, one permutes the n objects in the arrangements shown. If a value of the same kind is placed in both a squares, spatially they cannot be differentiated from each other; that is, direction does not matter. For example, in the case discussed above, if the value 5 is placed in the central square, the resulting pattern counts as one pattern; if, however, the value 5 is placed in a peripheral square, that also constitutes one pattern. Thus, there are two possible patterns when all objects are placed in one cell. Again, if each object is different, in this case, there will remain just two possible patterns.
We contrast this approach with the combinatorial problem of determining the number of arrangements for each partition. Table 1 gives the number of spatial and non-spatial patterns for each partition. The formula for finding the number of arrangements (non-spatial) for each partition is given as:
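(2) $v! \, (v_1! \, v_2! \cdots)^{-1}$
where v1 is the number of cells holding the largest value in the partition, v2 the number holding the second largest value, and so on, empty cells included. Thus, for the case n=5, v=5, and the partition 3,1,1 (one cell with 3 units, two cells with 1 unit, two empty cells), we have:
$5! \, (1! \, 2! \, 2!)^{-1} = 30$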
Table 1
Statistical distribution of patterns n = 5, v = 5
| Partition Type (by units in squares) | Spatial Patterns | Non-Spatial Patterns |
|---|---|---|
| A: 1 in each of 5 | 1 | 1 |
| B: 1 in each of 3, 2 in 1 | 4 | 20 |
| C: 1 in 1, 2 in each of 2 | 6 | 30 |
| D: 1 in each of 2, 3 in 1 | 6 | 30 |
| E: 2 in 1, 3 in 1 | 4 | 20 |
| F: 1 in 1, 4 in 1 | 4 | 20 |
| G: 5 in 1 | 2 | 5 |
| Total | 27 | 126 |

If one considers the presence of 3 objects or more in a cell as representing a spatial concentration, then the number of realizations is 16 (Types D, E, F, G) out of the 27 possible patterns or 59%.
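Both columns of Table 1 can be reproduced by brute-force enumeration. The sketch below rests on an assumption that is not stated in this form in the text: two arrangements are treated as the same spatial pattern when one can be turned into the other by a rotation or reflection of the cross-shaped area (the central square stays fixed). Under that reading the totals 27 and 126 are recovered. The cell order and all function names are hypothetical.

```python
from collections import Counter
from math import factorial


def symmetries():
    """Eight symmetries of the cross; index 0 is the centre, 1-4 the outer cells."""
    outer = [1, 2, 3, 4]
    perms = []
    for shift in range(4):
        rotated = outer[shift:] + outer[:shift]
        perms.append([0] + rotated)          # rotation
        perms.append([0] + rotated[::-1])    # rotation combined with a reflection
    return perms


def compositions(n, v):
    """All ways of placing n indistinguishable units in v ordered cells."""
    if v == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, v - 1):
            yield (first,) + rest


def count_patterns(n):
    """Spatial and non-spatial counts per partition for the radius-1 area."""
    perms = symmetries()
    spatial, non_spatial, seen = Counter(), Counter(), set()
    for pattern in compositions(n, 5):
        partition = tuple(sorted(pattern, reverse=True))
        non_spatial[partition] += 1
        # Smallest image under the symmetries labels the equivalence class.
        key = min(tuple(pattern[i] for i in p) for p in perms)
        if key not in seen:
            seen.add(key)
            spatial[partition] += 1
    return spatial, non_spatial


def arrangements(partition):
    """Equation (2): v! divided by the factorials of the value multiplicities."""
    result = factorial(len(partition))
    for multiplicity in Counter(partition).values():
        result //= factorial(multiplicity)
    return result


spatial, non_spatial = count_patterns(5)
for partition in sorted(spatial, reverse=True):
    assert non_spatial[partition] == arrangements(partition)
    print(partition, spatial[partition], non_spatial[partition])
print("totals", sum(spatial.values()), sum(non_spatial.values()))
```

With n = 10 the same enumeration yields 30 partition types, 174 spatial patterns and 1001 non-spatial arrangements, the figures used for the first example of Section 8.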
 
3. Concentration and Dispersion
 
 
A possible indicator of concentration or dispersion of a spatial pattern can be defined as follows:
(3) $c_p = \sum_j \Delta_{jc} \left( \max \sum_i \Delta_{ic} \right)^{-1}, \quad p \in P$
where P is the set of all possible patterns, p, and the Δjc’s are the absolute differences between the cell values within one unit of distance from the central square (index c); the indicator has values in the closed interval [0,1]: indeed, for complete dispersion, it takes on a value of zero, and for total concentration in one cell the value n is divided by itself. Thus, the range, 0 to 1, represents a useful index of concentration or dispersion.
If distance were not a factor in this system, then location over the cells would not be relevant to the calculation; in this case, for a given n and v, all patterns of a given type have the same value of cp. Table 2 gives the value of cp for each pattern type given above.

Table 2
Values of cp
| Type | cp |
|---|---|
| A | 0.0 |
| B | 0.4 |
| C | 0.6 |
| D | 0.7 |
| E | 0.8 |
| F | 0.9 |
| G | 1.0 |

It can be proved that cp equals the Gini concentration coefficient; indeed, the Gini coefficient can be written as:
(4) γ = v(v-1) - 2 Σi [i (i-1)]-1
where the sum is taken over v-1 cumulative terms of the Gini concentration graph.
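The link with the Gini coefficient suggests a simple numerical check of Table 2. The sketch below assumes the distance-free reading of the indicator: the absolute differences are taken over all pairs of cells, and the sum is divided by its maximum n(v-1), reached when all units sit in one cell. This is the Gini coefficient rescaled to the interval [0,1]; the function name and the example cell contents are illustrative.

```python
from itertools import combinations


def concentration(cells):
    """Sum of pairwise absolute differences divided by its maximum n*(v-1)."""
    n, v = sum(cells), len(cells)
    total = sum(abs(a - b) for a, b in combinations(cells, 2))
    return total / (n * (v - 1))


# One representative arrangement per pattern type of Table 1 (n = 5, v = 5).
types = {
    "A": (1, 1, 1, 1, 1),
    "B": (2, 1, 1, 1, 0),
    "C": (2, 2, 1, 0, 0),
    "D": (3, 1, 1, 0, 0),
    "E": (3, 2, 0, 0, 0),
    "F": (4, 1, 0, 0, 0),
    "G": (5, 0, 0, 0, 0),
}
for name, cells in types.items():
    print(name, round(concentration(cells), 1))
```

Because only the multiset of cell values enters the calculation, every arrangement of a given type receives the same value, as stated for the distance-free case.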
 
4. Eccentricity
 
 
Another possible indicator of pattern is the center of gravity, although one could argue that the center of the reference area is the center of gravity, despite the fact that most units can be peripheral to it. We prefer an indicator that takes into account the mass-weighted average distance to the center of the reference area (central cell, index c) divided by the maximal measure:
(5) $e_{pf} = \sum_i \mu_i d_{ic} \left( \max \sum_i \mu_i d_{ic} \right)^{-1}, \quad p \in P, \ f \in F_p$
which is defined over the closed interval [0,1]; p and P have been defined above. The values μi are the number of «outliers» in a partition f of a given type within a pattern p; Table 3 gives the values of epf for each of the spatial partitions identified above in Section 2.

Table 3
Values of ep
| Type | Values of ep for each partition, in the order given in Section 2 |
|---|---|
| A | 0.8 |
| B | 0.6, 0.8, 0.8, 1.0 |
| C | 0.6, 0.6, 0.8, 1.0, 0.8, 1.0 |
| D | 0.4, 0.4, 0.8, 0.8, 1.0, 1.0 |
| E | 0.4, 0.6, 1.0, 1.0 |
| F | 0.2, 0.8, 1.0, 1.0 |
| G | 0.0, 1.0 |

From this it is evident that there can be a wide range of eccentricity values for each pattern type; this is to be expected given that the measure is a function of the relative position of the more or less concentrated units.
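A minimal sketch of equation (5) follows. It assumes that the maximum in the denominator is reached when all n units lie at the largest distance from the central cell, so that it equals n times that distance; the cell order (a, b, c, d, e), with c central, is also an assumption made for illustration.

```python
def eccentricity(masses, distances):
    """Mass-weighted distance to the central cell, divided by its maximum."""
    total = sum(masses)
    weighted = sum(m * d for m, d in zip(masses, distances))
    return weighted / (total * max(distances))


# Radius-1 area: cells in the order (a, b, c, d, e), c being the central cell.
DIST = (1, 1, 0, 1, 1)

print(eccentricity((1, 1, 1, 1, 1), DIST))   # type A
print(eccentricity((0, 0, 4, 1, 0), DIST))   # type F, 4 units in the centre
print(eccentricity((0, 4, 1, 0, 0), DIST))   # type F, 1 unit in the centre
print(eccentricity((5, 0, 0, 0, 0), DIST))   # type G, all units peripheral
```

With r = 1 the indicator reduces to the share of units outside the central cell, which is why arrangements of the same type can take several values.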
 
5. Randomness
 
 
Some of the patterns may be described as more random than others. As an indicator, we have chosen the number of non-zero parameters needed to specify an appropriate polynomial in two variables, x and y, minus one, divided by the maximum number of such parameters, again minus one. The x and y are the coordinates of the centers of the elementary squares, the central square, c, being positioned at (0,0); as before, the distance between the centers of neighbouring cells is one. The rationale for this choice is taken from Chaitin (1975; see also Wolfram, 2002, p. 552 ff.); it rests on the idea that «randomness» should be defined as the complexity of the equation representing the observed values. This indicator also ranges between 0 and 1; the higher the value, the more random the pattern. The examples in Table 4 illustrate the workings of this process. The computed mass of a cell is given as mi; the index of randomness is denoted as rpf, the extra subscript, f, implying again that the index differs according to the partition within the pattern type. So:
(6) $r_{pf} = (n_p - 1)(v - 1)^{-1}$
where np is the required number of non-zero parameters, and v is defined by equation (1).

Table 4
Randomness equations
| Pattern (a b c d e) | mi | rpf |
|---|---|---|
| 1 1 1 1 1 | $1$ | 0 |
| 0 0 5 0 0 | $5(1 - x^2 - y^2)$ | 0.5 |
| 0 2 1 2 0 | $1 + x^2 - y^2$ | 0.5 |
| 2 2 1 0 0 | $1 - x + y$ | 0.5 |
| 1 2 1 0 1 | $1 - x$ | 0.25 |
| 1 2 1 1 0 | $1 - 0.5x + 0.5x^2 + 0.5y - 0.5y^2$ | 1.0 |

The last two examples show that the relative positioning of the units strongly influences the degree of randomness of the pattern.
It is true that for larger v, one has to solve large systems of linear equations, but with present computational facilities this would not be a problem.
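For the radius-1 area the linear system is small enough to be solved in closed form. The sketch below makes two assumptions inferred from the polynomials of Table 4 rather than stated in the text: the polynomial is built from the five terms 1, x, x², y, y², and the cells are located at a = (0,1), b = (-1,0), c = (0,0), d = (1,0), e = (0,-1). Function names are illustrative.

```python
from fractions import Fraction


def polynomial(a, b, c, d, e):
    """Coefficients of 1, x, x^2, y, y^2 that reproduce the five cell values.

    Assumed cell centres: a=(0,1), b=(-1,0), c=(0,0), d=(1,0), e=(0,-1).
    """
    a, b, c, d, e = (Fraction(t) for t in (a, b, c, d, e))
    return {
        "1": c,
        "x": (d - b) / 2,
        "x^2": (d + b) / 2 - c,
        "y": (a - e) / 2,
        "y^2": (a + e) / 2 - c,
    }


def randomness(pattern):
    """Equation (6): (n_p - 1) / (v - 1), n_p = number of non-zero coefficients."""
    coefficients = polynomial(*pattern)
    n_p = sum(1 for value in coefficients.values() if value != 0)
    return Fraction(n_p - 1, len(pattern) - 1)


for pattern in [(1, 1, 1, 1, 1), (0, 0, 5, 0, 0), (0, 2, 1, 2, 0),
                (2, 2, 1, 0, 0), (1, 2, 1, 1, 0)]:
    print(pattern, polynomial(*pattern), float(randomness(pattern)))
```

Exact fractions are used so that a coefficient is counted as zero only when it vanishes exactly; for larger areas a general linear solver and a numerical tolerance would be needed instead.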
 
6. Geophenogram
 
 
The results obtained e.g. for the 02120 pattern could be set out in a circle that characterizes the pattern (Figure 2); we have termed that figuration a geophenogram or spatial pattern descriptor. The geophenogram has at its center zero values, while the circumference represents the value one.
Fig. 2
Geophenogram
One could imagine representing such a pattern numerically; a possible measure could be the total area of the constituent triangles divided by its maximum value, which is 1.3 for unit radius. For sides of length a, b, and c, using Heron’s formula, one has:
(7) Area = √[s(s-a)(s-b)(s-c)]
where s = (a+b+c)/2, and, therefore, for the pattern 02120, the measure would be .393.
Another possibility is the total length of the sides of those triangles starting from the center, divided by its maximum, i.e., three. For the 02120 pattern, this value is .633.
Two measures that are easy to compute are the average score (which is already normalized to one) and the coefficient of variation, of which the maximum value is √3; this can be proved as follows.
The coefficient of variation is defined as:
(8) $v = \sigma / \mu$
the square of which can be expanded, using the definitions of σ and μ, and ni being the number of indicators considered, to:
(9) $v^2 = n_i \left( \sum_i x_i^2 \right) \left( \sum_i x_i^2 + \sum_{i \neq j} x_i x_j \right)^{-1}$
and, given non-negativity in the case of the indicators discussed, this quantity reaches a maximum for $\sum_{i \neq j} x_i x_j = 0$, implying that only one indicator is non-zero. For the 02120 pattern this value is 1.039.
Differences between patterns can be expressed as normalized Hamming distances, defined as the sum of absolute differences between indicators divided by their number. Again, for the 02120 pattern, with respect to the type A pattern, this value is .617.
For all of these synthetic indicators, it should be noted that the same value may correspond to different underlying joint patterns, as is well known from the Gini coefficient.
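The values quoted for the 02120 pattern can be retraced numerically. The sketch below assumes a geophenogram with three axes placed 120 degrees apart (which gives the stated maximum area of about 1.3 for unit radius) and takes the scores (cp, epf, rpf) = (0.6, 0.8, 0.5) for that pattern from Tables 2, 3 and 4; both points are inferences from the text, and the function names are hypothetical.

```python
from math import sqrt


def heron(a, b, c):
    """Equation (7): area of a triangle with sides a, b, c."""
    s = (a + b + c) / 2
    return sqrt(s * (s - a) * (s - b) * (s - c))


def triangle_area(p, q):
    """Triangle between two adjacent axes, with sides p and q from the centre."""
    chord = sqrt(p * p + q * q + p * q)   # law of cosines, cos(120 deg) = -1/2
    return heron(p, q, chord)


def summarize(scores):
    """Area ratio, length ratio and the quantity of equation (9)."""
    x, y, z = scores
    n = len(scores)
    area = triangle_area(x, y) + triangle_area(y, z) + triangle_area(z, x)
    max_area = 3 * triangle_area(1, 1)
    return {
        "area": area / max_area,
        "length": sum(scores) / n,
        "eq9": n * sum(s * s for s in scores) / sum(scores) ** 2,
    }


result = summarize((0.6, 0.8, 0.5))
print({name: round(value, 3) for name, value in result.items()})
```

Under these assumptions the three summaries round to .393, .633 and 1.039, the values reported above.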
 
7. Clusters
 
 
Here we focus on the central square and take distances, d, from the central square outward to r. The clustering (or non-clustering) measure, kpd, is defined as the proportion of the sum of cell values within d (d ≤ r) of the central square:
(10) kpd = Σj xj (d) / Σj xj
When all cell values are found within the v squares, the v squares constitute a cluster of one. The measure resides in the closed interval [0,1]; clearly, as d approaches r, the probability is high that the measure will approach one; thus, the measure is only useful when d is considerably less than r (see Getis and Ord, 1992, for a test of significance on the hypothesis of no clustering). In Table 5, we show the measure for distance 0 from the central square, i.e., only the value in the central square enters the numerator; clearly, those partitions with the highest values of kp0 are those with the highest values in the central square.

Table 5
Values of kpd
| Type | Values of kpd for each partition, in the order given in Section 2 |
|---|---|
| A | 0.2 |
| B | 0.4, 0.2, 0.2, 0.0 |
| C | 0.4, 0.4, 0.2, 0.0, 0.2, 0.0 |
| D | 0.6, 0.6, 0.2, 0.2, 0.0, 0.0 |
| E | 0.6, 0.4, 0.0, 0.0 |
| F | 0.8, 0.2, 0.0, 0.0 |
| G | 1.0, 0.0 |
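Equation (10) translates directly into a short function. In the sketch below the cell order (a, b, c, d, e), with c central, and the names used are illustrative assumptions.

```python
def clustering(values, distances, d):
    """Equation (10): share of the total found within distance d of the centre."""
    inside = sum(x for x, dist in zip(values, distances) if dist <= d)
    return inside / sum(values)


# Radius-1 area: cells in the order (a, b, c, d, e), c being the central cell.
DIST = (1, 1, 0, 1, 1)

print(clustering((0, 0, 4, 1, 0), DIST, 0))   # type F, 4 units in the centre
print(clustering((0, 4, 1, 0, 0), DIST, 0))   # type F, 1 unit in the centre
print(clustering((0, 4, 1, 0, 0), DIST, 1))   # d = r: every unit is counted
```

The last call illustrates why the measure is informative only for d well below r: at d = r it equals one for every pattern.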

 
8. Some examples
 
 
Suppose first we wish to know whether the partition of cities with populations greater than 100,000 in the Rocky Mountain states, including Colorado and its neighbours on the north (Wyoming), south (New Mexico), east (Kansas), and west (Utah), reveals interesting spatial pattern characteristics. Let us consider the rook’s case with unit distance between Colorado and each neighbour. The number of towns of the requisite size, in 1990, is 10: 4 for Colorado, 4 for Kansas, 1 each for Utah and New Mexico, and 0 for Wyoming.
There are 30 different partition types for 10 cities in 5 states. The total number of spatial patterns is 174 (the non-spatial total is 1001). For the partition 4 4 1 1 0, there are 6 spatial patterns (04411, 14401, 44101, 44110, 41104, 41014). Of the 174 spatial patterns, 106 (61%) have at least 5 cities in one state. Thus, it is not unusual for one state to have 4 cities, but only 12 of 174 patterns (6.9%) have exactly two states each having 4 cities in them.
The index of concentration, cp , is 0.25, indicating a relatively high level of dispersion.
Eccentricity for the observed pattern of cities is 0.60, a moderate value indicating that there is a tendency for the areas around Colorado (especially Kansas) to draw the intensity of the pattern away from Colorado.
As far as clustering is concerned, having four cities in Colorado, of the ten in the region under study, indicates a tendency for clustering but, as mentioned above, such a pattern is not unexpected.
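The values reported for this example can be retraced under one possible reading of the indicators, which is an assumption rather than something the text spells out: the Δ's of equation (3) are taken as the absolute differences between each neighbouring state and the central state, with maximum n(v-1) when all cities lie in the central state, and every neighbour is at unit distance. The function name is hypothetical.

```python
def describe(counts, centre):
    """Concentration, eccentricity and clustering (d = 0) for a radius-1 area."""
    n = sum(counts.values())
    v = len(counts)
    outer = [name for name in counts if name != centre]
    # Assumed reading of equation (3): differences relative to the central cell.
    c_p = sum(abs(counts[name] - counts[centre]) for name in outer) / (n * (v - 1))
    # Every neighbour is one distance unit from the centre (rook's case).
    e_pf = sum(counts[name] for name in outer) / n
    k_p0 = counts[centre] / n
    return c_p, e_pf, k_p0


cities = {"Colorado": 4, "Wyoming": 0, "New Mexico": 1, "Kansas": 4, "Utah": 1}
c_p, e_pf, k_p0 = describe(cities, "Colorado")
print(round(c_p, 2), round(e_pf, 2), round(k_p0, 2))
```

Under this reading the concentration index is 0.25 and the eccentricity 0.60, as reported, while Colorado holds four tenths of the cities.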
As a second example, the regional products ($10^6$ Dfl, at market prices) in the 12 Dutch provinces will be submitted to an analysis; Table 6 presents the data (source: Dutch Central Statistical Office, Statline).

Table 6
Regional product of the Dutch provinces (1997)
| Provinces | GRP |
|---|---|
| Groningen | 30.7 |
| Friesland | 22.0 |
| Drente | 16.8 |
| Overijssel | 41.4 |
| Flevoland | 9.1 |
| Gelderland | 74.7 |
| Utrecht | 52.2 |
| Noord-Holland | 119.3 |
| Zuid-Holland | 164.2 |
| Zeeland | 10.9 |
| Noord-Brabant | 105.0 |
| Limburg | 46.1 |

The index of concentration, cp, is calculated to be 0.4657. The computed value of the eccentricity indicator, epf, is 0.4456; in order to derive this value, the central «elementary square» was located at the province of Utrecht, which has the lowest total distance to the other provinces, those distances having been measured «as the crow flies» between provincial capital cities (a better distance measure, e.g. a Hausdorff distance, would take into account more locations internal to the provinces). The clustering indicator, kpd, is computed to be 0.810 for the 7 out of 12 provinces lying within half the maximal observed distance from the central unit, but this value is not statistically significant (Getis and Ord’s (1992) test reveals Z=+1.88). Finally, the randomness indicator, rpf, has the value 1, for the crude data as well as for a stylised version in which a reference area with radius 2 (the central square not being used) is introduced together with five classes of GRP. The provinces were allocated to squares according to a rough nearest neighbour assignment (for demonstration purposes).
This example shows that even outside the straitjacket of a stylised reference area, analytical descriptive results can be obtained from various statistical sources at the spatial level.
 
9. Conclusions
 
 
Clearly, there are a large number of measures that can be devised to index a spatial pattern. In this paper, the authors have presented measures of concentration and dispersion, eccentricity, randomness, and clustering, most of which identify various aspects of spatial autocorrelation structures (Kaashoek et al., 2004). Another possible entry is the study of certain properties of distance geometry (Blumenthal, 1970), such as the in-betweenness aspect of elements of the spatial pattern, leading up to the construction of sets of passing and terminal points.
An obvious next step in this work is to identify problems that might arise when r —and therefore v—is a large number, and to treat multidimensional observations. Applications in various endeavors in regional science would contribute to a better understanding of the nature of spatial patterns.
The authors thank an anonymous referee for numerous stimulating remarks, most of which have been taken into account in this revised version.
 
REFERENCES
 
·  Ahuja N. and Schachter B.J. (1983). Pattern Models. New York: Wiley.
·  Blumenthal L.M. (1970, second edition). Theory and Applications of Distance Geometry. Bronx, New York: Chelsea Publishing Company.
·  Chaitin G.J. (1975). «Randomness and Mathematical Proof». Scientific American, vol. 232, no 5, p. 47-52.
·  Christaller W. (1933). Die zentralen Orte in Süddeutschland. Jena: Fischer. (Translated by C. Baskin, 1966, The Central Places of Southern Germany. Englewood Cliffs: Prentice-Hall).
·  Getis A. and Boots B. (1978). Models of Spatial Processes: An Approach to the Study of Points, Lines, and Area Patterns. Cambridge: Cambridge University Press, Cambridge Geographical Studies 8.
·  Getis A. and Ord J.K. (1992). «The analysis of spatial association by use of distance statistics». Geographical Analysis, no 24, p. 189-206.
·  Haggett P., Cliff A.D. and Frey A. (1977). Locational Methods. London: Edward Arnold.
·  Kaashoek J.F., Paelinck J.H.P. and Zoller H., eds. (2004). «On Connectivity», in A. Getis, J. Mur and H. Zoller, Spatial Statistics and Econometrics. London: Palgrave.
·  Kuiper J.H. (1997). «General Commuting Models Compared». EGI-Onderzoekspublicatie 39. Erasmus University Rotterdam, Economic Geography Institute, and additional handwritten notes.
·  Kuiper J.H., Paelinck J.H.P. and Rosing K.E. (1990). «Transport Flows in Tinbergen-Bos Systems», in K. Peschel (ed.) Infrastructure and the Space-Economy, p. 29-52. Heidelberg-Bonn: Springer Verlag.
·  Lösch A. (1940). Die räumliche Ordnung der Wirtschaft. Jena: Fischer (translated by W.H. Woglom and W.F. Stolper, 1954, The Economics of Location. New Haven: Yale).
·  Mandelbrot B.B. (1983). The Fractal Geometry of Nature. San Francisco: W.H. Freeman and Company.
·  Neft D.S. (1966). «Statistical Analysis for Areal Distributions». Monograph Series 2, Philadelphia: Regional Science Research Institute.
·  Okabe A., Boots B. and Sugihara K. (1992). Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. New York: Wiley.
·  Paelinck J.H.P. (with the assistance of J.-P. Ancot and J.H. Kuiper) (1983). Formal Spatial Economic Analysis. Aldershot: Gower.
·  Paelinck J.H.P. (1988). «Périphéricité: aspects théoriques». Revue d’Économie Régionale et Urbaine no 1, p. 7-14.
·  Paelinck J.H.P. (1999). «Tinbergen-Bos Systems: A Compendium of Recent Research Results». Working Paper.
·  Tobler W. (1979). «A transformational view of cartography». The American Cartographer, vol. 6, no 2, p. 101-106.
·  Wentz E.A. (2000). «A shape definition for geographic applications based on edge, elongation, and perforation». Geographical Analysis, vol. 32, no 2, p. 95-112.
·  Wolfram S. (2002). A New Kind of Science. Champaign, IL: Wolfram Media, Inc.