Introducing shp2svg

    A new open-source tool for converting GIS shapefiles into SVGs.

    Anthony Pesce Los Angeles Times
    Map: The Times' U.S. county-by-county map for the 2012 presidential election. It was rendered with Raphael, and the county shapes were converted to SVG from a shapefile.

    One of the lar­ger pro­jects the Data Desk tackled this year was visu­al­iz­ing the 2012 pres­id­en­tial elec­tion res­ults. First, we built county-by-county res­ults maps of each state for the GOP primary. Then, for the gen­er­al, we had a more soph­ist­ic­ated present­a­tion with a state-by-state map, a car­to­gram and a na­tion­al county-by-county map.

    To render the maps we chose to use a JavaS­cript lib­rary called Raphael, which uses SVG in mod­ern browsers and falls back to VML in In­ter­net Ex­plorer 8 and be­low. Raphael was an ex­cel­lent solu­tion for us be­cause it’s a re­l­at­ively light­weight lib­rary, and the maps can be rendered quickly not only on the res­ults de­tail page but also on wid­gets on the homepage and mo­bile homepage. It renders the maps beau­ti­fully and simply, on a plain back­ground, and in a way that lets the data shine without visu­al in­ter­fer­ence from geo­graph­ic fea­tures, roads, build­ings or street names.

    Raphael is ex­cel­lent at what it does, but it’s not a map­ping lib­rary like Leaf­let and can’t handle raw geo­graph­ic data and map pro­jec­tions eas­ily like D3. So one of the primary chal­lenges throughout the de­vel­op­ment of our maps was turn­ing GIS shapefiles in­to SVG paths that Raphael could handle.

    Enter shp2s­vg

    Today we’re an­noun­cing the re­lease of shp2s­vg, an open-source Django-powered web­site that stream­lines and sim­pli­fies the con­ver­sion of GIS shapefiles to SVG paths. It al­lows you to up­load a SHP file (along with the DBF, PRJ and SHX files that come with it) and spits back SVG paths as JSON or an ed­it­able SVG file. You can render the paths at vir­tu­ally any size and with any one of the sev­er­al thou­sand map pro­jec­tions in­cluded with Post­GIS.

    It’s de­signed with SVG lib­rar­ies like Raphael in mind, so it only sup­ports con­vert­ing poly­gons and mul­ti­poly­gons in­to SVG path ele­ments. At this time shp2s­vg does not sup­port oth­er fea­tures in shapefiles, like points and poly­lines, though it may in the fu­ture. To aid in the cre­ation of pro­por­tion­al sym­bol maps, shp2s­vg can also in­clude the centroids for each poly­gon.

    What’s in the box

    Shp2s­vg is be­ing re­leased as a product that de­velopers can host loc­ally for in­tern­al use, sim­il­ar to the Panda Pro­ject. It in­cludes a fully func­tion­al front end built with Boot­strap, which can handle up­load­ing new shapefiles to the data­base or gen­er­at­ing new SVGs us­ing ex­ist­ing shapefiles. Shapefiles can be up­loaded as in­di­vidu­al com­pon­ents or zipped.

    The Times is not cur­rently host­ing shp2s­vg on the Web for pub­lic use, though oth­er de­velopers are cer­tainly wel­come to do so. If you have any ques­tions about how to get set up check out the readme or con­tact me.

    How we did it

    The site is built with GeoD­jango, which makes it easy to load geo­graph­ic data in­to a data­base, query it and ma­nip­u­late it. When a user up­loads a shapefile to the site, the files are stored on a par­ent mod­el called Shapefile­Con­tain­er. Then, us­ing Data­Source, we loop through all of the poly­gons and mul­ti­poly­gons in the shapefile, con­vert the poly­gons to mul­ti­poly­gons for con­sist­ency and store each one on a child mod­el called Shape.

    Once this is done, we can ac­cess the group of Shape ob­jects as a GeoQuerySet and use Django to pro­ject them and ex­tract their co­ordin­ates for con­ver­sion in­to SVG paths.

    Django al­lows you to eas­ily ex­tract the co­ordin­ates from a mul­ti­poly­gon like this:

    >>> Shape.poly.coords
    ((((-12139744.795340238, 5012635.741547457), (-12138858.895844406, 4439094.57990306), (-12300383.11008835, 4440128.263675176), (-12514270.312389202, 4440420.01845591), (-12693820.280335575, 4438284.271136337), (-12694231.590815783, 5159949.186691979), (-12362056.671742627, 5161218.638162391), (-12362528.384461187, 5016462.986205476), (-12139744.795340238, 5012635.741547457)),),)

    But, as you can see, it can be dif­fi­cult to fig­ure out what’s go­ing on. Each tuple like this “(-12139744.795340238, 5012635.741547457)” is a pair of X-Y co­ordin­ates that you could plot on a graph. Since we’re stor­ing each item as a mul­tip­loy­gon, each in­di­vidu­al poly­gon with­in it is rep­res­en­ted here as a tuple of tuples. Each poly­gon tuple is nes­ted in an­oth­er, gi­ant tuple for the whole mul­ti­poly­gon. For ex­ample, each is­land of Hawaii could be a set of co­ordin­ates in the mul­ti­poly­gon of the whole state.

    These co­ordin­ates are large and un­wieldy, and they change dra­mat­ic­ally de­pend­ing on which pro­jec­tion and/or co­ordin­ate ref­er­ence sys­tem you’re us­ing. So in or­der to trans­late these co­ordin­ates in­to pixels, we need to scale them down and con­vert them ap­pro­pri­ately. To start off we need to pull out max and min X and Y co­ordin­ates, or the “ex­tent” of the shapefile:

    # make lists for the x and y coords
    x_coords = []
    y_coords = []
    # loop through our GeoQuerySet of projected shapes
    for i in projected_shapes:
        # grap the extent for each multipolygon
        coords = i.poly.extent
        # and append that to our master lists
        # then, we can get the extent for whole shapefile by grabbing the
        # max and min values for each list
        extent = (min(x_coords), min(y_coords), max(x_coords), max(y_coords))

    Then, we use the ex­tent to cal­cu­late a con­stant we can use to scale the co­ordin­ates to a screen res­ol­u­tion.

    # Get the absolute difference between the largest 
    # and smallest x and y coordinates
    max_translated_x = abs(extent[2] - extent[0])
    max_translated_y = abs(extent[3] - extent[1])
    # Use the user-supplied "max_size" to determine the scaling constant
    if max_translated_x > max_translated_y:
        scale_factor = max_size / max_translated_x
        scale_factor = max_size / max_translated_y

    Fi­nally, we can loop through all of the mul­ti­poly­gons in our GeoQuerySet, ex­tract the co­ordin­ates, trans­late them to 0, 0 on the X-Y ax­is (so they ap­pear in the top left of the SVG can­vas) and mul­tiply them by our scal­ing con­stant. Once we have a set of trans­lated and scaled co­ordin­ates we can feed them in­to this func­tion to con­vert them to an SVG path:

    def coords_2_path(coord_list):
        Takes a list of coordinates and returns an SVG path
        if len(coord_list) == 1:
            path = 'M%s,%sZ' % (coord_list[0][0], coord_list[0][1])
            path = 'M%s,%s' % (coord_list[0][0], coord_list[0][1])
            for i in coord_list[1:]:
                path += 'L%s,%s' % (i[0], i[1])
            path += 'Z'
        return path.replace('-0.0', '0').replace('0.0', '0').replace('.0', '')

    There’s a bit more to it than this, but you’re wel­come to check out all of the code in the Git­Hub repo for this pro­ject.

    What you can do

    If you have any ideas for fea­tures or en­counter any bugs, you’re wel­come to sub­mit a tick­et on Git­Hub. Or, if you want to help the pro­ject along, here are some fea­tures we could use:

    1. Add sup­port for points and poly­lines in shapefiles
    2. Add sup­port for shapefiles with mul­tiple lay­ers
    3. Add sup­port for pro­ject­ing shapefiles with cus­tom PROJ.4 strings

    Readers: What’s your take? Share it here.


    Latest work

      About The Data Desk

      This page was created by the Data Desk, a team of reporters and Web developers in downtown L.A.