Polyloc -- Overview

Authors:	Giulio Bottazzi and Angelo Secchi
Contact:	<giulio.bottazzi@sssup.it>, <angelo.secchi@sssup.it>
Date:	4 May 2007
Revision:	0.1
Copyright:	GPL

Contents

Getting Started
References
Package files list

Getting Started

The program polyloc estimates the model of firm location developed by G. Bottazzi together with G. Dosi, G. Fagiolo and A. Secchi (see References below). It reads data from standard input in ASCII format and writes the results, also in ASCII format, both to standard output and to files in a directory named "results". Polyloc is written in Python and uses two external libraries (see below). It has been tested (and used many times) on different Linux distributions and should run, in principle, on any Unix platform. No other operating systems have been tested.

Please note that these programs were written for personal use and are distributed under the GPL license (see the file COPYING) in the hope that they may be of help to other people, but without any implied warranty.

Required libraries

Polyloc is based on Python, a dynamic object-oriented programming language renowned for its flexibility and simplicity. For details see http://www.python.org. In addition, polyloc requires two further libraries, Scipy and Numpy. For more information about Scipy, including installation procedures on various platforms, check http://www.scipy.org/. You can find Numpy at http://www.scipy.numpy.org/.

Brief description of the program

The program polyloc essentially works in three steps:

it reads a data file (see below for how to build the data file) from the standard input;

it removes "spatial effects" if an additional file with location types is provided (see below);

it estimates the requested model and saves the results in a directory named "results" (see below for a description of the output files).

Input Data File

Polyloc accepts ASCII files with geographical locations on the rows and industrial sectors on the columns. Location names are enclosed in double quotes and fields are comma separated. For example, suppose you have a location called "LOCATION1" with 1 firm (or employee, or other relevant proxy) belonging to the first industrial sector, 2 belonging to the second and 3 to the third, and a second location named "LOCATION2" with 12 firms (or employees, or other relevant proxies) in the first sector and none in the others. Then the corresponding data file looks like:

#Location, sector1, sector2, sector3
"LOCATION1",1,2,3
"LOCATION2",12,0,0

Lines beginning with the hash symbol # are ignored.
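The format described above can be read with a few lines of Python. The following is only a minimal sketch and not the parser used inside polyloc.py: the function name read_counts is hypothetical, and it assumes that every count is an integer.

```python
import csv
import io

def read_counts(stream):
    """Return (location_names, counts) parsed from a polyloc-style data stream.

    Hypothetical helper: assumes integer counts and double-quoted names.
    """
    # Skip blank lines and comment lines starting with '#'.
    lines = (ln for ln in stream if ln.strip() and not ln.lstrip().startswith("#"))
    names, counts = [], []
    for row in csv.reader(lines, skipinitialspace=True):
        names.append(row[0])                       # csv strips the double quotes
        counts.append([int(v) for v in row[1:]])   # one integer per sector
    return names, counts

# The example data file shown above.
sample = io.StringIO(
    '#Location, sector1, sector2, sector3\n'
    '"LOCATION1",1,2,3\n'
    '"LOCATION2",12,0,0\n'
)
names, counts = read_counts(sample)
print(names)
print(counts)
```

To read from standard input, as polyloc does, one could pass sys.stdin instead of the in-memory sample.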

With the option '-t', polyloc accepts a second input file providing additional information on the "nature" of each location. In particular, this file states whether each location belongs to a district, a large metropolitan area or a big city, so that the user can remove their effects when evaluating agglomeration models. The file with location types has the following structure (the order of the columns is important):

#Location, District, Metropolitan Area, City
"LOCATION1",0,1,1
"LOCATION2",1,0,0

where "0" means "not belonging to" while "1" means "belonging to".
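As an illustration of how such a file might be used, the sketch below reads the location types and keeps only the locations that are not flagged for a given effect. This is an assumption about what "removing an effect" means (it is consistent with the smaller number of locations reported in the '-e districts' example further down), not a description of polyloc's internals. The names read_types, drop_locations and the labels in TYPE_COLUMNS are hypothetical.

```python
import csv
import io

# Labels for the three flag columns, in the order of the example header.
# These are illustrative labels, not necessarily the values accepted by '-e'.
TYPE_COLUMNS = ("districts", "metropolitan areas", "cities")

def read_types(stream):
    """Map each location name to a dict of boolean flags, one per type."""
    lines = (ln for ln in stream if ln.strip() and not ln.lstrip().startswith("#"))
    types = {}
    for row in csv.reader(lines, skipinitialspace=True):
        flags = [int(v) == 1 for v in row[1:4]]    # "1" means "belonging to"
        types[row[0]] = dict(zip(TYPE_COLUMNS, flags))
    return types

def drop_locations(names, types, effect):
    """Keep only the locations that are NOT flagged for the given effect."""
    return [n for n in names if not types[n][effect]]

# The example type file shown above.
type_file = io.StringIO(
    '#Location, District, Metropolitan Area, City\n'
    '"LOCATION1",0,1,1\n'
    '"LOCATION2",1,0,0\n'
)
types = read_types(type_file)
kept = drop_locations(["LOCATION1", "LOCATION2"], types, "districts")
print(kept)
```

In this toy example LOCATION2 is flagged as a district, so only LOCATION1 is kept.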

Results and output files

Results are printed on the standard output and also saved in 3 files in a directory named "results". Note that polyloc checks whether the directory "results" exists and creates it only if necessary.
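The "create only if necessary" behaviour can be reproduced in a script that post-processes the output. The following is a sketch using the standard library, not the code found in polyloc.py; the helper name ensure_results_dir is hypothetical.

```python
import os
import tempfile

def ensure_results_dir(base="."):
    """Create base/results if it does not exist yet and return its path."""
    path = os.path.join(base, "results")
    os.makedirs(path, exist_ok=True)   # no error if the directory is already there
    return path

# Demonstration inside a throw-away directory.
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_results_dir(tmp)
    second = ensure_results_dir(tmp)   # second call finds the existing directory
    created = os.path.isdir(first)
print(created)
```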

For "model 1" and "model 2" the output consists of the estimated coefficients together with the Average Absolute Deviation (AAD), a measure of the model's goodness of fit (for details see the references below). By contrast, "model 0" prints nothing on the standard output (it only writes the corresponding files in the directory "results"), since it does not involve any explicit estimation. For each model polyloc generates three different files; for "model 1" they are:

results/model1_coef.txt        : contains estimated coefficients with AAD for all the sectors provided
results/model1_emp.txt         : contains the occupancy class frequency computed on observed data.
results/model1_theo.txt        : contains the occupancy class frequency computed using the theoretical models with estimated parameters.

These files can be used to save results and produce sectoral plots (see "How to produce plots" below).

Examples

Here is the syntax to fit "Model 1" without removing any geographical effects:

>python polyloc.py -m model1 -v < test.dat

#==============  Model Specification  ==============

Model to fit is                           : model1

Number of Locations                       : 784

Removing the effect of                    : None

Initial conditions for the optimization   : 1.00 1.00

#===================================================


#######  SECTOR  1  #######

 Model 1

 b = 1.17102499695

 AAD= 0.0364294331994

#######             ########

If you want to estimate "Model 2" using a specific set of initial conditions:

>python polyloc.py -m model2 -i 0.8,0.8 -v < test.dat

#==============  Model Specification  ==============

Model to fit is                           : model2

Number of Locations                       : 784

Removing the effect of                    : None

Initial conditions for the optimization   : 0.80 0.80

#===================================================


#######  SECTOR  1  #######

 Model 2

 b = 1.17088822335 beta= 9.93460595477e-108

 AAD= 0.0364293108742 Chi^2= 0.207784017963


#######             ########

If you want to remove other geographical effects (districts, metropolitan areas, cities), run:

>python polyloc.py -m model1 -t test_type.dat -e districts -v < test.dat

#==============  Model Specification  ==============

Model to fit is                           : model1

Number of Locations                       : 667

Removing the effect of                    : districts

Initial conditions for the optimization   : 1.00 1.00

#===================================================

#######  SECTOR  1  #######

Model 1

b = 1.25869624926

AAD= 0.0427225672437

#######             ########
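The command lines above can also be assembled from a Python script, for instance to loop over several models. The sketch below uses only the options that appear in the examples (-m, -i, -t, -e, -v). The function names polyloc_command and run_polyloc are hypothetical, and the sketch assumes that polyloc.py is in the current directory and that the interpreter running the script is one that polyloc.py supports.

```python
import subprocess
import sys

def polyloc_command(model, initial=None, type_file=None, effect=None, verbose=True):
    """Build the argument list for one polyloc run (hypothetical helper)."""
    cmd = [sys.executable, "polyloc.py", "-m", model]
    if initial is not None:
        # Initial conditions are passed as a comma-separated list, e.g. 0.8,0.8
        cmd += ["-i", ",".join(str(x) for x in initial)]
    if type_file is not None:
        cmd += ["-t", type_file]
    if effect is not None:
        cmd += ["-e", effect]
    if verbose:
        cmd.append("-v")
    return cmd

def run_polyloc(data_path, **options):
    """Run polyloc with the data file on standard input and capture its output."""
    with open(data_path) as data:
        return subprocess.run(polyloc_command(**options), stdin=data,
                              capture_output=True, text=True, check=True)

# Only build and show the command here; nothing is executed.
cmd = polyloc_command("model2", initial=(0.8, 0.8))
print(" ".join(cmd[1:]))
```

A call such as run_polyloc("test.dat", model="model1") would then correspond to the first example, with the printed report available in the stdout attribute of the returned object.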

How to produce plots

Using the files saved in the directory "results" it is possible to produce plots comparing the occupancy class frequencies computed on observed data with those estimated using Model 0, Model 1 or Model 2. In the following we provide the lines needed to generate plots with GNUPLOT, a portable command-line driven interactive data and function plotting utility for UNIX, IBM OS/2, MS Windows, DOS, Macintosh, VMS, Atari and many other platforms.

Here are the relevant lines of gnuplot code:

Setting the stage:

gnuplot>reset
gnuplot>set log y
gnuplot>set title 'Sector 1'
gnuplot>set style fill solid 1.0
gnuplot>set border 2
gnuplot>set ytics nomirror
gnuplot>set xtics  nomirror ("C_1" 0, "C_2" 1, "C_3" 2, "C_4" 3, "C_5" 4, "C_6" 5, "C_7" 6, "C_8" 7, "C_9" 8, "C_{10}" 9 )

Prepare data: the next three lines select the first column (i.e. the first sector) of the results and transpose it. They make use of GBUTILS, a set of utilities for the manipulation and statistical analysis of data freely available at http://www.sssup.it/~giulio/software/gbutils/index.html. Clearly, any other utility able to do the same job may be used:

gnuplot>system "gbget 'results/model1_emp.txt(,1)tD' > results/.temp"
gnuplot>system "gbget 'results/model1_theo.txt(,1)tD' > results/.temp1"
gnuplot>system "gbget 'results/model2_theo.txt(,1)tD' > results/.temp2"

Plotting Occupancy classes:

gnuplot>plot[:][:] "<gbget 'results/.temp'" u 0:1:(1) w boxes fs solid 0 lt -1 lw 1.75 title "Observed",\
"<gbget 'results/.temp1' " u ($0-0.2):1:(.3) w boxes fs solid 0.3 lt 1 title "Model 1",\
"<gbget 'results/.temp2' " u ($0+0.2):1:(.3) w boxes fs solid 0.8 lt 1 title "Model 2"

Generating an .eps output file and deleting temporary files:

gnuplot>set term post eps enha 'Times-Roman' 20
gnuplot>set output 'sector1.eps'
gnuplot>replot
gnuplot>set output
gnuplot>set term x11
gnuplot>system "rm results/.temp*"

References

Bottazzi, G., G. Dosi, G. Fagiolo, and A. Secchi (2007) "Modeling Industrial Evolution in Geographical Space", Journal of Economic Geography, forthcoming.
Bottazzi, G., G. Dosi, G. Fagiolo, and A. Secchi (2004) "Sectoral and Geographical Specificities in the Spatial Structure of Economic Activities", L.E.M. Working Paper 2004-21.

Package files list

COPYING        GNU General Public License
README         this file
AUTHORS        the list of authors
NEWS           list of package modifications
ChangeLog              ""      ""
test.dat       demo file
test_type.dat  demo file for locations type

polyloc.py     source code for general purpose functions
model0.py      definition and estimation of "Model 0"
model1.py      definition and estimation of "Model 1"
model2.py      definition and estimation of "Model 2"

polyloc.css    stylesheet for HTML docutils