Authors: | Giulio Bottazzi and Angelo Secchi |
---|---|
Contact: | <giulio.bottazzi@sssup.it>, <angelo.secchi@sssup.it> |
Date: | 4 May 2007 |
Revision: | 0.1 |
Copyright: | GPL |
The program polyloc estimates the model of firm location developed by G. Bottazzi together with G. Dosi, G. Fagiolo and A. Secchi (see References below). It reads data from standard input in ASCII format and prints the results in ASCII format both to standard output and to files in a directory named "results". Polyloc is written in Python and makes use of two external libraries (see below). It has been tested (and used many times) on different Linux distributions and should run, in principle, on any Unix platform. No other operating systems have been tested.
Please note that these programs were written for personal use and are distributed under the GPL license (see the file COPYING) in the hope that they will be of help to other people, but without any implied warranty.
Polyloc is based on Python, a dynamic object-oriented programming language renowned for its flexibility and simplicity. For details see http://www.python.org. Polyloc also requires two further libraries, Scipy and Numpy. For more information about Scipy, including installation procedures on various platforms, see http://www.scipy.org/. Numpy is available at http://www.scipy.numpy.org/.
The program polyloc essentially works in three steps:
- it reads a data file (see below for how to build the data file) from the standard input;
- it removes "spatial effects" if an additional file with location types is provided (see below);
- it estimates the requested model and saves the results in the directory "results" (see below for a description of the output files).
Polyloc accepts ASCII files with geographical locations on rows and industrial sectors on columns. Location names are enclosed in double quotes and fields are comma-separated. For example, suppose a location called "LOCATION1" has 1 firm (or employee, or other relevant proxy) in the first industrial sector, 2 in the second and 3 in the third, while a second location named "LOCATION2" has 12 firms (or employees, or other relevant proxies) in the first sector and none in the others. The corresponding data file then looks like:
    #Location, sector1, sector2, sector3
    "LOCATION1",1,2,3
    "LOCATION2",12,0,0
Lines beginning with the hash symbol # are ignored.
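As an illustration of the format just described, the following sketch parses such a file with Python's standard csv module. The helper name read_counts is hypothetical and is not part of polyloc; the parser actually used in polyloc.py may differ.

```python
import csv
import io


def read_counts(stream):
    """Parse a polyloc-style data file into (names, rows).

    Hypothetical helper, not part of polyloc: lines starting with '#'
    are skipped, the first field is the location name and the remaining
    fields are integer counts, one per sector.
    """
    # Drop comment lines and blank lines before handing the rest to csv.
    data_lines = [ln for ln in stream
                  if ln.strip() and not ln.lstrip().startswith("#")]
    names, rows = [], []
    for fields in csv.reader(data_lines, skipinitialspace=True):
        names.append(fields[0])                     # quotes are stripped by csv
        rows.append([int(x) for x in fields[1:]])   # one count per sector
    return names, rows


example = io.StringIO(
    '#Location, sector1, sector2, sector3\n'
    '"LOCATION1",1,2,3\n'
    '"LOCATION2",12,0,0\n'
)
names, rows = read_counts(example)
print(names)
print(rows)
```

Passing sys.stdin instead of the StringIO object would read the file the same way polyloc receives it, from standard input.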
With the option '-t', polyloc accepts a second input file providing additional information on the "nature" of each location. In particular, this file states whether each location belongs to a district, a large metropolitan area or a big city, so that the user can remove their effects when evaluating agglomeration models. The file with location types has the following structure (the order of the columns is important):
    #Location, District, Metropolitan Area, City
    "LOCATION1",0,1,1
    "LOCATION2",1,0,0
where "0" means "not belonging to" while "1" means "belonging to".
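A minimal sketch of how such a file could be used is given here. It assumes that "removing an effect" means dropping the flagged locations before estimation; the sample runs in this README report fewer locations when districts are removed, which is consistent with this reading, but polyloc itself may treat them differently. The helper names read_types and drop_locations are hypothetical.

```python
import csv
import io


def read_types(stream):
    """Return {location name: (district, metropolitan area, city)} flags."""
    lines = [ln for ln in stream
             if ln.strip() and not ln.lstrip().startswith("#")]
    return {f[0]: tuple(int(x) for x in f[1:4])
            for f in csv.reader(lines, skipinitialspace=True)}


def drop_locations(names, rows, types, column):
    """Keep only locations whose flag in the given column is 0.

    Assumption: removing an effect is modelled here as dropping the
    flagged locations; this is not taken from the polyloc source.
    """
    kept = [(n, r) for n, r in zip(names, rows) if types[n][column] == 0]
    return [n for n, _ in kept], [r for _, r in kept]


types = read_types(io.StringIO(
    '#Location, District, Metropolitan Area, City\n'
    '"LOCATION1",0,1,1\n'
    '"LOCATION2",1,0,0\n'
))
names = ["LOCATION1", "LOCATION2"]
rows = [[1, 2, 3], [12, 0, 0]]

# Column 0 is the District flag, so LOCATION2 is dropped.
kept_names, kept_rows = drop_locations(names, rows, types, column=0)
print(kept_names, kept_rows)
```

Because the flags are read by position, swapping two columns in the type file would silently remove the wrong locations, which is why the column order matters.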
Results are printed to standard output and also saved in 3 files in a directory named "results". Polyloc checks whether the directory "results" exists and creates it only if necessary.
For "model 1" and "model 2" the output consists of the estimated coefficients together with the Average Absolute Deviation (AAD), a measure of the model's goodness of fit (for details see the references below). In contrast, "model 0" prints nothing to standard output (it only writes the corresponding files in the directory "results"), since it does not involve any explicit estimation. For each model polyloc generates three files:
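The "create only if necessary" behaviour can be sketched in current Python 3 as follows. This is an illustration, not the code used in polyloc.py; the helper name ensure_results_dir is hypothetical.

```python
import os
import tempfile


def ensure_results_dir(base="."):
    """Create base/results if it does not exist yet and return its path."""
    path = os.path.join(base, "results")
    os.makedirs(path, exist_ok=True)  # no error if the directory is already there
    return path


# Demonstrated in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_results_dir(tmp)
    second = ensure_results_dir(tmp)  # second call finds the directory and does nothing
    created = os.path.isdir(first)
```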
    results/model1_coef.txt : contains estimated coefficients with AAD for all the sectors provided
    results/model1_emp.txt  : contains the occupancy class frequency computed on observed data.
    results/model1_theo.txt : contains the occupancy class frequency computed using the theoretical models with estimated parameters.
These files can be used to save results and produce sectoral plots (see "How to produce plots" below).
Here is the syntax to fit "Model 1" without removing any geographical effects:
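To give an idea of what the AAD measures, the sketch below computes it as the plain mean of the absolute differences between observed and theoretical occupancy class frequencies. This definition is an assumption made for illustration; the exact definition used by polyloc is the one given in the references and may differ, for instance in its normalisation. The frequencies are made up and are not polyloc output.

```python
def average_absolute_deviation(observed, theoretical):
    """Mean of |observed - theoretical| over occupancy classes.

    Assumed definition, for illustration only.
    """
    if len(observed) != len(theoretical):
        raise ValueError("both frequency lists must have the same length")
    return sum(abs(o - t) for o, t in zip(observed, theoretical)) / len(observed)


# Made-up frequencies for three occupancy classes.
observed = [0.50, 0.30, 0.20]
theoretical = [0.45, 0.35, 0.20]
aad = average_absolute_deviation(observed, theoretical)
print(aad)
```

A smaller value means the theoretical frequencies lie closer to the observed ones.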
    >python polyloc.py -m model1 -v < test.dat
    #============== Model Specification ==============
    Model to fit is : model1
    Number of Locations : 784
    Removing the effect of : None
    Initial conditions for the optimization : 1.00 1.00
    #===================================================
    ####### SECTOR 1 #######
    Model 1
    b = 1.17102499695
    AAD= 0.0364294331994
    ####### ########
To estimate "Model 2" using a specific set of initial conditions:
    >python polyloc.py -m model2 -i 0.8,0.8 -v < test.dat
    #============== Model Specification ==============
    Model to fit is : model2
    Number of Locations : 784
    Removing the effect of : None
    Initial conditions for the optimization : 0.80 0.80
    #===================================================
    ####### SECTOR 1 #######
    Model 2
    b = 1.17088822335
    beta= 9.93460595477e-108
    AAD= 0.0364293108742
    Chi^2= 0.207784017963
    ####### ########
To remove other geographical effects (districts, metropolitan areas, cities) run:
    >python polyloc.py -m model1 -t test_type.dat -e districts -v < test.dat
    #============== Model Specification ==============
    Model to fit is : model1
    Number of Locations : 667
    Removing the effect of : districts
    Initial conditions for the optimization : 1.00 1.00
    #===================================================
    ####### SECTOR 1 #######
    Model 1
    b = 1.25869624926
    AAD= 0.0427225672437
    ####### ########
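When several runs have to be scripted, the command lines above can be assembled programmatically. The sketch below only uses the options shown in the examples (-m, -i, -t, -e, -v); the helper name polyloc_command is hypothetical and not part of the package.

```python
def polyloc_command(model, types_file=None, effect=None, init=None, verbose=True):
    """Build the argument list for a polyloc run (hypothetical helper)."""
    cmd = ["python", "polyloc.py", "-m", model]
    if init is not None:
        # Initial conditions are passed as a comma-separated list, e.g. 0.8,0.8
        cmd += ["-i", ",".join(str(x) for x in init)]
    if types_file is not None:
        cmd += ["-t", types_file]
    if effect is not None:
        cmd += ["-e", effect]
    if verbose:
        cmd.append("-v")
    return cmd


cmd = polyloc_command("model1", types_file="test_type.dat", effect="districts")
print(" ".join(cmd))

# To actually run it, feed the data file on standard input, e.g.:
#   import subprocess
#   with open("test.dat") as data:
#       subprocess.run(cmd, stdin=data, check=True)
```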
Using the files saved in the directory "results" it is possible to produce plots comparing the occupancy class frequencies computed on observed data with those estimated using Model 0, Model 1 or Model 2. In the following we provide the commands to generate such plots with GNUPLOT, a portable command-line driven interactive data and function plotting utility for UNIX, IBM OS/2, MS Windows, DOS, Macintosh, VMS, Atari and many other platforms.
Here are the relevant lines of gnuplot code:
Setting the stage:
    gnuplot>reset
    gnuplot>set log y
    gnuplot>set title 'Sector 1'
    gnuplot>set style fill solid 1.0
    gnuplot>set border 2
    gnuplot>set ytics nomirror
    gnuplot>set xtics nomirror ("C_1" 0, "C_2" 1, "C_3" 2, "C_4" 3, "C_5" 4, "C_6" 5, "C_7" 6, "C_8" 7, "C_9" 8, "C_{10}" 9 )
Prepare data: the next three lines select the first column (i.e. the first sector) of the results and transpose it. They use GBUTILS, a set of utilities for the manipulation and statistical analysis of data, freely available at http://www.sssup.it/~giulio/software/gbutils/index.html. Any other utility able to do the same job can be used instead:
    gnuplot>system "gbget 'results/model1_emp.txt(,1)tD' > results/.temp"
    gnuplot>system "gbget 'results/model1_theo.txt(,1)tD' > results/.temp1"
    gnuplot>system "gbget 'results/model2_theo.txt(,1)tD' > results/.temp2"
Plotting occupancy classes:
    gnuplot>plot[:][:] "<gbget 'results/.temp'" u 0:1:(1) w boxes fs solid 0 lt -1 lw 1.75 title "Observed",\
    "<gbget 'results/.temp1' " u ($0-0.2):1:(.3) w boxes fs solid 0.3 lt 1 title "Model 1",\
    "<gbget 'results/.temp2' " u ($0+0.2):1:(.3) w boxes fs solid 0.8 lt 1 title "Model 2"
Generating an .eps output file and deleting temporary files:
    gnuplot>set term post eps enha 'Times-Roman' 20
    gnuplot>set output 'sector1.eps'
    gnuplot>replot
    gnuplot>set output
    gnuplot>set term x11
    gnuplot>system "rm results/.temp*"
    COPYING         GNU Public License
    README          this file
    AUTHORS         the list of authors
    NEWS            list of package modifications
    ChangeLog       ""   ""
    test.dat        demo file
    test_type.dat   demo file for locations type
    polyloc.py      source code for general purpose function
    model0.py       definition and estimation of "Model 0"
    model1.py       definition and estimation of "Model 1"
    model2.py       definition and estimation of "Model 2"
    polyloc.css     stylesheet for HTML docutils