Setting up a Blank SQL Server Spatially enabled Table using Microsoft SQL Server Management Studio 2008R2 Express and displaying it in QGIS 2.8.1

Programs used;

1- SQL Server 2008R2 Express
2- SQL Server Management Studio 2008R2 Express
3- QGIS

The example uses UK national grids coordinates to create a Triangle Polygon in a SQL Server Table

I’ve previously written that while we’ve had spatially enabled SQL Server for over 5 years I constantly come across line of business applications that although using SQL Server have not and do not intend to spatially enable the application. This is undoubtedly because of the difficulty in re-designing legacy systems actively in use and because the benefits although significant are not generally requested by all but the most knowledgable of colleagues.

While I understand this legacy system reasoning spatially enabled databases are the future so its just a matter of when and not if an application will require alteration. Understanding it in this context makes it really a requirement to start seriously planning for its inclusion.

Developerers creating new applications however should always consider spatially enabling relevant tables from the start even if it is not specked by the client/colleague. It being so much easier to spend a couple of minutes future proofing the schema of a new born database rather than hours trying to retrofit a live in production back end.

Firstly it’s important to understand what a geodatabase in SQL Server actually is.
Really it is a normal database which has one table that has a field that has a geometry or geography value type. In this example I will use desktop QGIS 2.8.1 to display the resulting geometry but any other digital mapping package that can link to SQL Server could be used. SQL Server also has a very rudimentary Mapping Display but you will need something better if you want to manipulate boundaries visually.

Many digital mapping products have plugins that will create Geodatabases and tables however I haven’t seen one for QGIS. I really wanted to be able to create spatial SQL tables on my own without recourse to paid tools directly in SQL Server Management Studio. So here’s my method of creating blank polygon table whose geometry is ready to be read and edited in QGIS or any other digital mapping system just using SQL Server Management Studio Express 08R2.

Steps
1. Create a new Table
2. Ensure the table has an identity Key that increments
3. Create a geometry column
4. Write a query that updates the geometry column

UPDATE T001Sites SET Coordinates=geometry::STGeomFromText(‘POLYGON((301804 675762,295789 663732,309581 664870,301804 675762))’,27700)

You will note that there are four coordinates here (each coordinate being a pair of numbers )
The first coordinate and last are the same this is required by SQL to ensure that the polygon is closed.

The 27700 number is the Spatial Reference System Identifier (SRID) – it is a unique value used to unambiguosly identify projecttion. 27700 is the unique identifier for the United Kingdom National Grid and represents the coordinates my example refer to. The spatial reference identification system is defined by the European Petroleum Survey Group (EPSG) standard which is a set of standards developmed for cartography surveying etc and owned by the Oil and Gas Producers Group list here; http://www.epsg-registry.org/

The above coordinates display a triangle in West Lothian near Edinburgh

5. Set up the connection to SQL Server Instance

Ensure the box marked “Only look in the geometry_columns metadata table” checkbox is unchecked. By default this is checked and if the geometry_columns table does not exist you will get an error message.

QGIS-SSMS-Connection

6. Display the table and edit as appropriate.

Select the table then hit the Add button

QGIS-SSMS-TableReadyforDisplay

And here is the SQL Server table in QGIS ready to be added to or edited.
QGISshowingSQLServerPolygon

Administering a Web Application some thoughts

In the last couple of years I have started administering a bought in Web Application from an outside vendor. I have found this really quite educational. I will not reveal the name of the product the vendor seems neither better nor worse than other vendors and many of the issues I am experiencing I see repeated with other admins of other web applications.

The software consists of various pieces of middleware linking SQL Server and the .Net and Javascript files. It uses a fairly standard Microsoft Stack to deliver a single page application. The big selling point was that that it was widely available through browsers pretty much anywhere on the web.

We have had it up and running in anger for a couple of years now and I have about 75 users all internal.

The vendor has created a new version and we now need to update. In discussion with the company they tell me that I should not be doing this myself and it is standard practice to get them in to upgrade the version. We already use it more as a stack technology to build our own applications I would want to continue this in the future. They have indicated that it will cost money to get them in to upgrade the software. This is triggering alarm bells because the existing configuration is very complicated, errors are quite regular and error messages tend to be pretty cryptic or non existent. I also like doing setup myself as it teaches me the ins and outs and speeds my learning up for systems which have non-trivial configuration.
A project board has been set up senior management are on board and well its grown legs arms and tails.

We have not widely publicised the application to the general public and they do not access the application at present.

The application itself requires significant amounts of configuration to set up correctly and parameters tend to be static and buried deep in very large HTML files. All this leads me to conclude that configuration in the new version is likely to be just as awkward. The company’s help desk is very good and there is extensive documentation but the sheer obscurity and quantity of configuration parameters makes the documentation the size of a phone directory, difficult and opaque.

Here’s some questions I should remind myself of when looking at New Web Applications in the future.
Are staff really so distributed that you require wide distribution?
Is it an internal or external distribution? – If not external do you really need a web enabled application.
Do you have internal expertise that is interested in developing applications?
With Web Applications or Local applications the backend database technology could well be the same. Internal local area development is likely to be far more flexible / cheap / quick and could motivate staff additionally it can be far more rich in its scope. It would be possible for instance to develop rich interface internally and a minimal interface externally using something like Lightswitch. This would take advantage of the strengths of both development environments.

There is a big push towards web applications I just think IT managers need to carefully consider their positions these applications tend not to be cheap and have a number of disadvantages that seem to have been pushed out of sight.

Database Normalisation – a review of the rules to refresh my knowledge

Here is a review of the 6 forms of normalisation (yes the 4th is called 3.5 for some reason) – there are lots of summaries on the web of these, but I wanted to re-write them in my own words to ensure I understand them fully. Revision of these post real world experience is particularly enlightening.

Normalisation rules are cumulative ie a database or set of tables can be said to be 1 through to 5 compliant. To rank at the highest level it must comply with all previous forms.

That is to say that a database (or table) to be in the 2nd Normal Form must first fulfill all the criteria of the 1st Normal Form.

A database is said to be properly normalised if the first three normalised forms are inviolate.

Violations of normalisation forms subsequent to the third normal form may be legitimately required and as such are NOT absolutely required for a database to be classified as properly normalised.

Some web explanations do not list forms post 3 I would argue there is still a lot of value in trying to adhere to these less well forms but must admit to not always including them in design usually due to poor knowledge of the application domain.

The differing forms of normalisation start simple becoming increasingly complicated.
If you have a lot of experience in creating relational backends you may have deduced these issues for yourself.

(1NF) First Normal Form – Edgar F Codd – first stated 1970
Eliminate duplicate columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

(2NF) Second Normal Form – Edgar F Codd – first stated 1971
No partial dependency on anything but the key fields

(3NF) Third Normal Form – Edgar F Codd – first stated 1971
Every non prime attribute of the primary key is non transitively dependent on every key of R. The example I saw was that if you had winners of wimbledon some of which repeated you would not store there birth dates. You would have a junction table of the winners and link to the names of the individuals and it would be in this table that you would store the dates of birth.

The 3.5 Normal Form or the Boyce Codd Normal Form
This is a slightly stronger version of the third normal form developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of anomaly not dealt with by the 3rd Normal form as originally defined.

Only in rare cases does a 3rd Normal Form table not meet the requirements of Boyd Codd Normal Form. A 3rd Normal Form table which does not have multiple overlapping candidate keys is guaranteed to be in Boyce Codd Normal Form.

Web investigation indicates that the most concise description is;
Every determinant must be a candidate key

and that Boyd Codd Normalised Form is not always possible

Nearest Shops
Person Shop Type Nearest Shop
Mark Jewellers H Samuel
Mark Barbers Mr Man
Natalia Coffee Shop Starbucks
John Coffee Shop Costa
John Barbers Mr Man
John Jewellers H Samuel

The issue above is that Mr Man on record 2 could be changed to a Hairdressers for shop type but the second last record John would not change and as such it would seem to be inconsistent. In this case I would split this table into two with Nearest Shop and Shop Type being linked and there being some kind of table that links People to Nearest shops.

The 4th Normal Form
A table is in the 4th Normal Form if it has no multi-valued dependencies.
Definition: Multivalued dependencies occur when the presence of one or more rows in a table implies the presence of one or more other rows in that same table.

For example lets say you have a you have a shop selling coffee – lets call it Starbucks and you have a table that lists the permutations of cofee sold in its store
CoffeeShop Coffee Product Location
Starbucks Café Late Livingston
Starbucks Americano Livingston
Starbucks Café Late Edinburgh
Starbucks Americano Edinburgh

If we know that all Starbucks sell the same products if there is a new product in one shop then we would expect that new product to be in the other shops so adding a Skinny Late to the Livingston shop implies that there should be another record for Skinny Lates in the Edinburgh shop.

To eliminate the possibility of anomalies we must place the facts about the products and the location of the shops into two tables

Products
Starbucks Café Late
Starbucks Americano
Starbucks Skinny Late

Location of Coffee shops
Starbucks Livingston
Starbucks Edinburgh

Alternatively if the coffee types did vary from location to location the original table would satisfy 4NF

The 5th Normal Form
A table is in fifth normal form or Project-Join Normal Form if it is in the Fourth Normal Form and it cannot have a loss-less decomposition into any number of smaller tables.

Or as I like to call it the – I know about it and try to adhere to it but sometimes I’m a bit lazy form.

I certainly try to keep my designs to the full six normalisation principles but there are instances where certain tables do not comply.