Many improvements #1

Open
wants to merge 51 commits into
base: master
5ffdffb
Cleaned up README.md
r-barnes Mar 10, 2018
02dcc3f
Make executable
r-barnes Mar 10, 2018
fa8111d
Merge branch 'master' of github.com:pnklein/district
r-barnes Mar 10, 2018
e466181
Add necessary header
r-barnes Mar 23, 2018
628644f
Add better error checking and command line argument interface
r-barnes Mar 23, 2018
44c671a
Make debugging message more helpful.
r-barnes Mar 23, 2018
847dbfc
Initial import
r-barnes Mar 23, 2018
10ac515
Added extract_district_boundaries.py
r-barnes Mar 24, 2018
8f007bb
Ignore *.o
r-barnes Mar 24, 2018
8a94812
Formatting. Eliminated mixed tabs/spaces.
r-barnes Mar 24, 2018
bb7493f
Fixed indenting: tabs and spaces should not be mixed.
r-barnes Mar 24, 2018
e93b4bf
Use const in the function definition to protect pointer inputs
r-barnes Mar 24, 2018
8507940
Use const for `population` variable.
r-barnes Mar 24, 2018
9194435
Added rationale commenting and used `at` for improved access safety o…
r-barnes Mar 24, 2018
117d2f3
Resolve a sign warning
r-barnes Mar 24, 2018
17cde13
Resolve sign warnings. Switch to using `std::`
r-barnes Mar 24, 2018
ec05f7d
Compiling with `-g` doesn't make code slower, so it's a good thing to…
r-barnes Mar 24, 2018
18aadce
Eliminate unused variables to suppress warnings
r-barnes Mar 24, 2018
6815ca6
Resolved signedness warnings
r-barnes Mar 24, 2018
3ca5b0f
Fixing whitespace
r-barnes Mar 26, 2018
0c6f020
Generate multiple district plans
r-barnes Mar 26, 2018
eff56ef
Set random seed from time.
r-barnes Mar 26, 2018
f2bc4bc
Fixed bug
r-barnes Mar 26, 2018
d3b96c9
Using more useful names
r-barnes Apr 2, 2018
34b7dda
Output districts rather than sadly broken polygons
r-barnes Apr 2, 2018
c9e951c
Extract districts
r-barnes Apr 2, 2018
28efcb0
Handle possibility of multithreading
r-barnes Apr 2, 2018
fd3449e
Ignore executables and outputs
r-barnes Apr 2, 2018
384bca5
Fix script for running within other scripts
r-barnes Apr 2, 2018
06a94e3
Make RUN.sh use scripts from its own directory
r-barnes Apr 5, 2018
2d8ca50
Compile using standard CXX flag
r-barnes Apr 5, 2018
2ce5ad1
Add file extensions
r-barnes Apr 5, 2018
8d0667f
Changes to drop std to c++11 from c++1z
r-barnes Apr 5, 2018
1c929c5
Ignore *.exe
r-barnes Apr 5, 2018
609d3f1
Ditch matplotlib dependency
r-barnes Apr 5, 2018
9624c26
Drop matplotlib dependency
r-barnes Apr 5, 2018
beeba76
Fix file path
r-barnes Apr 5, 2018
a99f8b3
Don't delete temporary files. Might be needed to recover valuable work
r-barnes Apr 6, 2018
1091549
Endeavour to clean polygons prior to intersecting them
r-barnes Apr 6, 2018
197c443
Clean polygons before calculating intersection
r-barnes Apr 6, 2018
a068196
Try harder to converge
r-barnes Apr 6, 2018
9dae723
Add command-line argument checking
r-barnes Apr 6, 2018
8d47d19
Add link-time optimization
r-barnes Apr 6, 2018
576f158
Merge branch 'master' of github.com:r-barnes/district
r-barnes Apr 6, 2018
47de671
Moved license info to its own file.
r-barnes Apr 6, 2018
735cf42
Added better instructions for using the code
r-barnes Apr 6, 2018
673c3b9
Clarified WKT format.
r-barnes Apr 6, 2018
3809828
Formatting
r-barnes Apr 15, 2018
d0cab2b
Fix random number generation so it uses a better seed
r-barnes Apr 15, 2018
84be299
Add script header
r-barnes Apr 15, 2018
f726ed1
Shuffle input to improve randomness
r-barnes Apr 16, 2018
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
*.o
do_redistrict
out_*
*.exe
24 changes: 24 additions & 0 deletions LICENSE
@@ -0,0 +1,24 @@
Note that the file code incorporates an adapted version of CS2, Andrew Goldberg
and Boris Cherkassky's implementation of a min-cost flow algorithm due to
Goldberg. See cs2-COPYRIGHT for the license information, and also see
cs2-README.

The rest of the code is subject to the following license:
Copyright 2017 Philip N. Klein and Vincent Cohen-Addad

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
141 changes: 88 additions & 53 deletions README.md
@@ -1,44 +1,75 @@
Note that the file code incorporates an adapted version of CS2, Andrew
Goldberg and Boris Cherkassky's implementation of a min-cost flow
algorithm due to Goldberg. See cs2-COPYRIGHT for the license
information, and also see cs2-README.
Balanced Centroidal Power Diagram Redistricting
===============================================

The rest of the code is subject to the following license:
Copyright 2017 Philip N. Klein and Vincent Cohen-Addad
Compilation
-----------

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
Compile the code by running:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
make

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Running the Code: Short Version
-------------------------------

The code can be run using:

./RUN.sh <Population File> <State Name> <States Outline> <District Num> <Output Name>

where

* `<Population File>` is a census block shapefile. These have names like `tabblock2010_44_pophu` (see below).
* `<State Name>` is the FIPS code of the state to be processed. In this example, this is `44`.
* `<States Outline>` is the name of a state boundary shapefile. This is a file like `cb_2016_us_state_500k` (see below).
* `<District Num>` is the number of districts to generate
* `<Output Name>` is the name of the output file.

Output will be a representation of the districts in well-known text (WKT) format.
This is suitable for interoperation with many standard GIS programs.
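
As an illustration of the argument order, the invocation could be assembled from Python roughly as follows. This is only a sketch: `build_run_command` is a hypothetical helper, the district count `2` and the output name `out_districts.wkt` are made-up values, and it assumes `RUN.sh` sits in the current directory.

```python
def build_run_command(pop_file, state_fips, states_outline, district_num,
                      output_name, script="./RUN.sh"):
    """Return the argv list for RUN.sh (hypothetical helper, not part of the repo).

    RUN.sh expects exactly five arguments, in this order.
    """
    district_num = int(district_num)
    if district_num < 1:
        raise ValueError("district_num must be a positive integer")
    return [script, str(pop_file), str(state_fips), str(states_outline),
            str(district_num), str(output_name)]


# Example values taken from the README; the output name is an assumption.
cmd = build_run_command("tabblock2010_44_pophu", 44,
                        "cb_2016_us_state_500k", 2, "out_districts.wkt")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```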



Running the Code: Long Version
------------------------------

Extract census blocks and populations
python3 read_census_blocks.py <input directory name> <output_filename>
where <input directory name> contains shape file specifying census blocks, e.g.
abblock010_44_pophu/tabblock2010_44_pophu

python3 read_census_blocks.py <input directory name> <output_filename>

where `<input directory name>` contains the shapefile specifying census blocks,
e.g.

tabblock2010_44_pophu/tabblock2010_44_pophu

which can be downloaded from https://www.census.gov/geo/maps-data/data/tiger-data.html
(Select Population & Housing Unit Counts -- Blocks, then select a state.)

The output file written has one line per client point.
It specifies the x coordinate (longitude), the y coordinate
(latitude), and the population assigned to that point.
The script selects the point to be the centroid of the census block
shape. (WHAT HAPPENS IF THE SHAPE CONSISTS OF MULTIPLE POLYGONS?)
The output file written has one line per client point. It specifies the x
coordinate (longitude), the y coordinate (latitude), and the population assigned
to that point. The script selects the point to be the centroid of the census
block shape. (WHAT HAPPENS IF THE SHAPE CONSISTS OF MULTIPLE POLYGONS?)
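
The multi-polygon question is left open in the README. One common convention, shown here purely as an illustration and not as a description of what `read_census_blocks.py` does, is to take the area-weighted mean of the centroids of the parts. The function names are hypothetical, coordinates are treated as planar, and holes are ignored.

```python
def polygon_area_centroid(ring):
    """Signed area and centroid of a simple polygon [(x, y), ...] via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0.0:
        raise ValueError("degenerate polygon with zero area")
    return area2 / 2.0, (cx / (3.0 * area2), cy / (3.0 * area2))


def multipolygon_centroid(parts):
    """Area-weighted mean of the centroids of several polygons (assumed convention)."""
    total = sx = sy = 0.0
    for ring in parts:
        area, (cx, cy) = polygon_area_centroid(ring)
        weight = abs(area)  # orientation should not affect the weight
        total += weight
        sx += weight * cx
        sy += weight * cy
    return sx / total, sy / total
```

With this convention a small island attached to a large block barely moves the client point, which may or may not be the desired behaviour.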

Also, extract the boundary polygons of a state:
python3 read_state_shapefile.py <ST> <input directory name>
where <ST> is the two-letter abbreviation for a state, and <input directory name>
is the name of a directory (not including suffix) giving shape records for
state boundaries, e.g. cb_2016_us_state_500k as downloaded from https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html

python3 read_state_shapefile.py <ST> <input directory name>

where `<ST>` is the two-letter abbreviation for a state, and `<input directory name>`
is the name of a directory (not including suffix) giving shape records for state
boundaries, e.g. `cb_2016_us_state_500k` as downloaded from
https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html

Next, compute the clustering using
do_redistrict <k> <input_filename>
where the first argument is the number of clusters to find, and the
input file is in the format of the output of the read_census_blocks.py script.
This program sends some text indicating progress to standard err, and,
when it terminates, sends the output to standard out.
Output format:

do_redistrict <k> <input_filename>

where the first argument is the number of clusters to find, and the input file
is in the format of the output of the `read_census_blocks.py` script. This program
writes progress messages to standard error and, when it terminates, writes its
result to standard output.

Output format:

<num centers> <num clients>
<center x> <center y> <center z>
<center x> <center y> <center z>
@@ -54,29 +85,33 @@ when it terminates, sends the output to standard out.
Standard out should be piped into a file

Next, use
python3 Voronoi_boundaries.py <input filename> <output filename>
to produce a file that specifies:
the client points, with colors reflecting the assignment to
centers, and the boundaries of the convex polygons that form the
power diagram of the chosen centers.

python3 Voronoi_boundaries.py <input filename> <output filename>

to produce a file that specifies: the client points, with colors reflecting the
assignment to centers, and the boundaries of the convex polygons that form the
power diagram of the chosen centers.

Format:
<num centers> <num clients>
<center x> <center y> <color>
<center x> <center y> <color>
.
.
<center x> <center y> <color>
<x> <y> <x> <y> ... <x> <y>
<x> <y> <x> <y> ... <x> <y>
.
.
<x> <y> <x> <y> ... <x> <y>

<num centers> <num clients>
<center x> <center y> <color>
<center x> <center y> <color>
.
.
<center x> <center y> <color>
<x> <y> <x> <y> ... <x> <y>
<x> <y> <x> <y> ... <x> <y>
.
.
<x> <y> <x> <y> ... <x> <y>

Next, use
python3 plotGNUPlot.py <input filename> <boundary filename> <output GNUplot file>
where <boundary filename> is the name of a file specifying the
boundaries of the state, given in the format

python3 plotGNUPlot.py <input filename> <boundary filename> <output GNUplot file>

where <boundary filename> is the name of a file specifying the boundaries of the state, given in the format

<x> <y>
<x> <y>
.
@@ -86,16 +121,16 @@ where <boundary filename> is the name of a file specifying the
<x> <y>
.
.
<empty line>
<empty line>
<x> <y>
<x> <y>
.
.

where each sequence of x-y lines specifes the coordinates of polygon
vertices of some polygon that is part of the boundary of the state.
where each sequence of x-y lines specifies the vertex coordinates of one polygon
that is part of the boundary of the state.
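
A reader for this boundary format might look like the sketch below. It assumes whitespace-separated `x y` pairs and treats any run of blank lines as a polygon separator; `parse_boundary_polygons` is a hypothetical name, not a function in this repository.

```python
def parse_boundary_polygons(text):
    """Split boundary text into polygons, each a list of (x, y) float tuples."""
    polygons = []
    current = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line ends the polygon being read, if any.
            if current:
                polygons.append(current)
                current = []
            continue
        if len(fields) != 2:
            raise ValueError("expected 'x y', got: " + repr(line))
        current.append((float(fields[0]), float(fields[1])))
    if current:  # the file may not end with a blank line
        polygons.append(current)
    return polygons
```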

The output is a file with gnuplot commands. Running gnuplot on that
file shows the client points according to the color assignment, and
also the state boundaries, and also the boundaries of the
power-diagram cells (clipped against the state boundaries).
The output is a file with gnuplot commands. Running gnuplot on that file shows
the client points colored by assignment, the state boundaries, and the
boundaries of the power-diagram cells (clipped against the state boundaries).
41 changes: 41 additions & 0 deletions RUN.sh
@@ -0,0 +1,41 @@
#!/bin/bash

if [ "$#" -ne 5 ]; then
echo "Syntax: $0 <Population File> <State Name> <States Outline> <District Num> <Output Name>"
exit 1
fi

#Get the script's directory
dir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)

pop_file=$1
state_name=$2
state_shapefile=$3
district_num=$4
output_name=$5

rand_str=`tr -dc A-Za-z0-9_ < /dev/urandom | head -c 20` #Prevents issues when using parallelism
temp_out_pop="temp_census_data_${state_name}_${rand_str}"
temp_out_power="temp_power_diag_${state_name}_${rand_str}"
temp_out_state="temp_state_boundary_${state_name}_${rand_str}"
temp_out_voronoi="temp_voronoi_boundary_${state_name}_${rand_str}"
temp_out_gnuplot="temp_gnuplot_${state_name}_${rand_str}"

echo "Reading census blocks..."
python3 "$dir/read_census_blocks.py" "$pop_file" "$temp_out_pop"

echo "Reading state boundaries..."
python3 "$dir/read_state_shapefile.py" "$state_name" "$state_shapefile" > "$temp_out_state"

echo "Generating power diagram..."
"$dir/do_redistrict.exe" "$district_num" "$temp_out_pop" > "$temp_out_power"

echo "Generating Voronoi boundaries..."
python3 "$dir/Voronoi_boundaries.py" "$temp_out_power" "$temp_out_voronoi"

echo "Extracting districting boundaries..."
python3 "$dir/extract_district_boundaries.py" "$temp_out_voronoi" "$temp_out_state" > "$output_name"
#python3 plotGNUPlot.py $temp_out_voronoi $temp_out_state $temp_out_gnuplot False
#gnuplot < $temp_out_gnuplot

#rm -f $temp_out_pop $temp_out_power $temp_out_state $temp_out_voronoi $temp_out_gnuplot
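
The random suffix on the temporary file names is what keeps parallel runs of the script from overwriting each other's intermediate files. The same idea in Python, as a sketch only: `temp_names` is a hypothetical helper that mirrors the naming scheme of `RUN.sh` and uses the same alphabet as the `tr` call.

```python
import secrets
import string

# Same character set as `tr -dc A-Za-z0-9_` in RUN.sh.
ALPHABET = string.ascii_letters + string.digits + "_"


def temp_names(state_name, length=20):
    """Return temp file names that share one random suffix per run."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(length))
    stems = ["census_data", "power_diag", "state_boundary",
             "voronoi_boundary", "gnuplot"]
    return {stem: "temp_{}_{}_{}".format(stem, state_name, suffix)
            for stem in stems}
```

A random suffix makes collisions very unlikely but does not rule them out; `tempfile.mkstemp` would create each file atomically if that matters.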
19 changes: 3 additions & 16 deletions Voronoi_boundaries.py
100644 → 100755
@@ -1,11 +1,8 @@
import numpy as np
import matplotlib.pyplot as plt
import sys
import scipy.spatial as sp
import shapely.geometry as sg
from matplotlib import colors as mcolors
color_dict = dict(mcolors.BASE_COLORS, **mcolors.CSS4_COLORS)
# colors = [x for x in color_dict if x not in {"w",'aliceblue','antiquewhite','azure','beige','bisque','blanchedalmond'}]

colors = [
'red', #ff0000 = 255 0 0
'web-green', #00c000 = 0 192 0
@@ -139,8 +136,6 @@ def PlotAll(C, A, assignment, bounded_regions, bbox, output):
f.write("\n") #x, y, color = 'black')
Plot_extra_lines(C, f)
f.close()
# plt.axis([bbox[0][0],bbox[1][0], bbox[0][1],bbox[1][1]])
# plt.show(block=True)



@@ -200,20 +195,12 @@ def find_proj(bounded_regions):
proj_regions[i].append(proj_point)
return proj_regions

def plot_regions(proj_regions):

for r in proj_regions:
if proj_regions[r] == []: continue
region = proj_regions[r]
convex_hull = sg.MultiPoint(region).convex_hull
x,y = convex_hull.exterior.xy
plt.plot(x, y, color = 'black')


def Plot_extra_lines(C,f):
diagram = sp.Voronoi(C)



def unbounded(input_region): return any(x==-1 for x in input_region)
## insert points to remove
4 changes: 2 additions & 2 deletions check_weights.cpp
@@ -4,11 +4,11 @@ using namespace std;

bool check_weights(const vector<Point> & clients, const vector<Point> & centers, const Assignment & assignment, const vector<double> & weights){
const double tolerance= 1e-6;
for (int i = 0; i < clients.size(); ++i){
for (unsigned int i = 0; i < clients.size(); ++i){
for (AssignmentElement ae : assignment[i]){

double client_to_assigned_center_weighted_dist_sq = centers[ae.center].dist_sq(clients[i]) + weights[ae.center];
for (int j = 0; j < centers.size(); ++j){
for (unsigned int j = 0; j < centers.size(); ++j){

if (client_to_assigned_center_weighted_dist_sq > centers[j].dist_sq(clients[i]) + weights[j]+tolerance){
cerr << "ERROR: client " << i << " closer to center " << j << " than to assigned center " << ae.center << "\n";
Expand Down
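
For readers less familiar with power diagrams, the condition this function checks is that every client is assigned to a center minimizing squared distance plus that center's weight, up to a tolerance. A Python sketch of the same test follows. It assumes `assignment[i]` is a list of center indices (the C++ uses `AssignmentElement` objects with a `center` field) and it returns `False` at the first violation, which is a guess because the hunk is truncated.

```python
def dist_sq(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def check_weights(clients, centers, assignment, weights, tolerance=1e-6):
    """True if no client has a strictly cheaper center than an assigned one."""
    for i, client in enumerate(clients):
        for assigned in assignment[i]:
            assigned_cost = dist_sq(centers[assigned], client) + weights[assigned]
            for j, center in enumerate(centers):
                if assigned_cost > dist_sq(center, client) + weights[j] + tolerance:
                    return False  # center j beats the assigned center
    return True
```

With all weights equal this reduces to the ordinary nearest-center (Voronoi) condition; unequal weights shift the cell boundaries, which is how the districts can be balanced by population.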
24 changes: 23 additions & 1 deletion do_redistrict.cpp
@@ -1,11 +1,20 @@
#include <iostream>
#include <fstream>
#include "assignment.hpp"
#include "random.hpp"
#include "redistrict.hpp"
#include "print_out_solution.hpp"

using namespace std;

int main(int argc, char *argv[]){
seed_rand(0);

if(argc!=3){
cout<<"Syntax: "<<argv[0]<<" <Number of Districts> <Population Data>"<<std::endl;
return -1;
}

int num_centers = atoi(argv[1]);
// string client_filename = argv[2];
std::ifstream inf(argv[2]);
@@ -19,7 +28,20 @@ int main(int argc, char *argv[]){
populations_vec.push_back(population);
}
}
auto [centers, assignment, weights ] = choose_centers(clients, &populations_vec[0], num_centers);

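//Shuffle clients (and their populations in lockstep) to improve randomness.
//The `i+1<size` bound avoids unsigned underflow when there are fewer than two
//clients and lets the last two elements be swapped as well.
for(unsigned int i=0;i+1<clients.size();i++){
const int j = uniform_rand_int(i,clients.size()-1);
std::swap(clients[i], clients[j]);
std::swap(populations_vec[i], populations_vec[j]);
}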

std::vector<Point> centers;
Assignment assignment;
std::vector<double> weights;

choose_centers(clients, &populations_vec[0], num_centers, centers, assignment, weights);

if (centers.size() == 0){
cout << "FAILURE TO CONVERGE\n";
return -1;
Expand Down
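
The shuffle in this hunk is a Fisher-Yates pass that keeps two parallel arrays in step. A Python sketch of that technique is below; `shuffle_together` is a hypothetical name, and the use of `randint` assumes `uniform_rand_int(a, b)` includes both endpoints, which the diff does not confirm.

```python
import random


def shuffle_together(clients, populations, rng=random):
    """Fisher-Yates shuffle applied in place to two parallel lists."""
    if len(clients) != len(populations):
        raise ValueError("lists must have the same length")
    n = len(clients)
    for i in range(n - 1):  # the last position needs no swap
        j = rng.randint(i, n - 1)  # inclusive on both ends
        clients[i], clients[j] = clients[j], clients[i]
        populations[i], populations[j] = populations[j], populations[i]
```

Passing a seeded `random.Random` instance as `rng` makes the permutation reproducible, which helps when comparing district plans across runs.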