Saturday 15 January 2011

Using Powershell to load Ordnance Survey Code-Point CSV data

Another off-topic jaunt, but I have been playing with Microsoft's Powershell recently and quite enjoyed it.

Powershell is an interactive scripting language with strong ties to the .Net framework - meaning that .Net code can be accessed whilst scripting.  I found it easy to get going with and used it to import code-point data from 120 CSV files into a SQL Server database.

The script is based on Doogal Bell's excellent blog post in which he provides a C# program to import code-point.  I borrowed Doogal's .Net Coordinates library which was downloaded and accessed directly from within the script, simply by using the Add-Type command to locate the DLL file.  This library enabled me to easily convert the OSGB coordinates into latitude and longitude.

The script is provided below and could be used by a simple cut and paste and changing the first few lines to reference your own environment.

# Add reference to DotNetCoords
Add-Type -path "C:\Users\GrahaMorgan\Software\.net code\DotNetCoords.dll"

# Specify location of codepoint csv files
$dataLocation = "C:\Data\Ordnance Survey\codepoint_gb\Code-Point Open\Data"

# set up database connection
$connectionString = "Server=GRAHAMORGAN-PC\Spatial1; " +
                    "Database=pafdb; " +
                    "Integrated Security=SSPI;"
          
# Connect to database and open it
$sqlconnection = New-Object System.Data.SqlClient.SqlConnection $connectionString
$sqlconnection.Open()

$sqlcmd = New-Object System.Data.SqlClient.SqlCommand
$sqlcmd.Connection = $sqlconnection 

# get hold of all csv files
$files = get-childitem $datalocation *.csv | foreach-object {$_.name}

foreach ($file in $files)
{
    $filename = join-path $datalocation $file
    write-output("Processing " + $filename)

    $data = import-csv $filename -header ("postcode","B","C","D","E","F","G","H","I","J","easting","northing","M","N","O","P","Q","R","S")
    foreach ($row in $data)
    {  
        $OSRef = new-object DotNetCoords.OSRef($row.easting,$row.northing)
        $LatLng = $OSRef.ToLatLng();
        $latLng.ToWGS84();
              
        $sql = "insert into codepoint_coords (postcode, longitude, latitude, coordinates) " +
               "VALUES ('{0}', {1},{2}, geography::STPointFromText('POINT({1} {2})', 4326))" -f $row.postcode, $latLng.longitude, $latLng.latitude
         
        # Execute sql insert      
        $sqlcmd.CommandText = $sql
        $res = $sqlcmd.ExecuteNonQuery()
    }   
}

$sqlconnection.close()

Microsoft have made Powershell available from within SQL Server and it will be interesting to see what people do with it.  I guess one obvious use would be to take advantage of the database import tools which would probably provide much faster performance.  On the other hand, the flexibility of the script to manipulate the data, such as changing coordinate system is pretty nice.

Note syntax highlighting via Alex Gorbatchev's open-source SyntaxHighlighter; instructions on MLA Wire.