SQL Server Get Average Wind Direction With User Defined Aggregate Functions

Imagine you have a wind direction device that is logging periodical values to your SQL Server database. You are asked to obtain a 30min average of the wind direction. Easy, you say. A simple ‘SELECT AVG(WindDir) FROM LogValues’ will do the trick! But alas it will not. This article will, step by step, guide you through the process of creating the wind direction aggregate function and using it in a query.

Overview

Example solution

Imagine you have a wind direction device that is logging periodical values to your SQL Server database. You are asked to obtain a 30min average of the wind direction.

Easy, you say. A simple ‘SELECT AVG(WindDir) FROM LogValues’ will do the trick! But alas it will not. The standard AVG function in SQL Server does not address averages when applied to circular measurements in degrees (or any other engineering unit for that matter). Take a scenario where the wind is blowing at 350 degrees one moment and then 10 degrees the next (as in the picture below). A normal average function will calculate (360+10) / 2 = 185 degrees, the opposite direction! Epic fail.



Clearly the standard average aggregate function is not sufficient for our needs, fortunately we can create a user-defined aggregate function in SQL Server to do as we please. This article will, step by step, guide you through the process of creating the wind direction aggregate function and using it in a query.

Requirements

You are going to need the below software in order to complete this guide:
1. MS SQL Server 2005 or above (I have used SQL Server 2008)
2. MS Visual Studio 2005 or above (again I have used the 2008 version)

What is a User-defined Aggregate Function?

An aggregate function in SQL Server is a function that provides a service (such as a calculation) over groups of data; examples are SUM, AVG, MIN and MAX just to name a few. A custom aggregate function is one that we develop and use, per database, as required.

Having a Look Around

Where are these Aggregate Functions anyway?

1. Open Microsoft SQL Server Management Studio and connect to you SQL Server instance
2. Navigate to your database in question, or create a new one. The database I am using is called SQLEXAMPLEDB
3. Now expand Databases | SQLEXAMPLEDB | Programmability | Functions |Aggregate Functions.


4. Right click the ‘Aggregate Functions’ folder. Notice that you do not have the option to create a new function. This is because Aggregate Functions need to be programmed in .NET and compiled as an assembly, hence there is no interface in Management Studio for creating them.

5. So, the Aggregate Functions will be stored somewhere on your hard drive (not necessarily the GAC) and will be referenced from the Aggregate Functions folder in each database it is loaded in to.


A Visual Delight

Visual Studio is now required to create the Aggregate Function, it isn’t anything special or difficult – just a simple dll in the end.

1. Open Microsoft Visual Studio 2008 (or other)
2. File | New | Project
3. The ‘New Project’ dialog will open (pictured below). Select your language of choice (I only have C# installed) and then select ‘Class Library’



4. Enter ‘SQLAggExample’ as the name of the project and press ‘OK’
5. You will be presented with ‘Class1’ as below.

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

namespace SQLAggExample

{

public class Class1

{

}

}

Sorting out the Includes

The using references need to be changed, replace the default with:

using System;

using System.Data;

using Microsoft.SqlServer.Server;

using System.Data.SqlTypes;

using System.IO;

Classy Names

The class name of ‘Class1’ needs to be changed, replace it with ‘AvgWindDir’ and use your refactoring tool to conveniently change the filename too. Alternatively, in the Solution explorer, rename the ‘Class1.cs’ to AvgWindDir.cs manually.

Have you eaten your serial?

Serialization is the process of saving an objects state, to memory, for later retrieval. SQL Server will serialize your aggregate class as it sees fit, we can control the serialization by implementing the IBinarySerialize interface. This means we will have to implement a ‘read’ and ‘write’ member function to handle the de/serialization of the aggregate function members.

Add the IBinarySerialize interface to the AvgWindDir class:

public class AvgWindDir : IBinarySerialize

Insert ‘Read’ and ‘Write’ functions (below) in the class, we will enter the appropriate code for saving our persistent objects later.

public void Read(BinaryReader r)

{

}

public void Write(BinaryWriter w)

{

}

Finally add a [Serializable] attribute to the class, this indicates that the class supports serialization. Putting it all together:

using System;

using System.Data;

using Microsoft.SqlServer.Server;

using System.Data.SqlTypes;

using System.IO;

namespace SQLAggExample

{

[Serializable]

public class AvgWindDir : IBinarySerialize

{

public void Read(BinaryReader r)

{

}

public void Write(BinaryWriter w)

{

}

}

}

Is that my Attribute or yours?

How do we tell SQL Server that our class is a user defined aggregate function? By the attributes of course! The class needs to be marked with the SQLUserDefinedAggregate attribute, this tells SQL Server that the class within the dll is a user defined aggregate and what it supports.

Enter the attribute below the [Serialization] attribute, but above the class definition.

[Serializable]

[SqlUserDefinedAggregate(

Format.UserDefined,

IsInvariantToNulls = true,

IsInvariantToDuplicates = false,

IsInvariantToOrder = false,

MaxByteSize = 8000) ]

public class AvgWindDir : IBinarySerialize

Format is either ‘Native’ or ‘UserDefined’. Native format lets the CLR take care of the serialization, otherwise we get to specify it (hence the above step).

IsInvariantToNulls translated means, is your aggregate able to operate correctly and accurately if a NULL value is delivered to the function. For example a COUNT aggregate should not count fields with NULL values. We are not phased about this for this example as we will filter out NULLS anyway, so the value is set to TRUE.

IsInvariantToDuplicates means, is your aggregate able to operate correctly and accurately if a duplicate records are delivered to the function. We do not want duplicates to creep in our Average function, fine for a Min or Max. Hence this attribute is set to false.

IsInvariantToOrder is not used or relevant.

MaxByteSize is set to the maximum allowed value of 8000 bytes. We can refine this value later to the true size of the data we are serializing.

Classes that implement the SqlUserDefinedAggregate attribute must also support a variety of methods, the methods are the topic of the next section.

But First

The wind direction algorithm we will use will be based on a vector based approach. There are some limitations this solution in that the aggregate is not guaranteed an ordered time-series dataset and that wind strength is not going to be incorporated in the solution.

We need some variables to store the data, which we are defined below:

private double Vx; // To store the average sine of vectors

private double Vy; // to store the average cosine of vectors

private Int32 TotalRecords; // Total amount of records we have processed

Init

The first Aggregate method we need to implement is the Initialisation method ‘Init’, this will be called first by SQL Server when beginning the aggregate process. Here we will simply set our values to zero.

/// <summary>

/// Initialize the internal data structures

/// </summary>

public void Init()

{

Vx = 0;

Vy = 0;

TotalRecords = 0;

}

Merge

SQL Server is able to partially process aggregate queries and hence requires a way to merge aggregate results, we need to provide a method to do this. In our case we only need to add the internal variables from the merge object with the local variables.

The ‘other’ parameter is going to be another instance of the AvgWindDir class.

/// <summary>

/// Merge the partially computed aggregate with this aggregate.

/// </summary>

/// <param name="other"></param>

public void Merge(AvgWindDir other)

{

Vx += other.Vx;

Vy += other.Vy;

TotalRecords += other.TotalRecords;

}

Accumulate

Here is where the accumulation occurs, this function is called by SQL Server for each value in the recordset. The parameter that is passed to the method must be a SQL type (in our case SqlDouble) in order for SQL Server to pass the correct datatype and ensure the aggregate is applied to the correct data. We are expecting degrees as a SqlDouble (float) to be passed to the aggregate.

First we will check for NULLs, if the passed value is null then the method will immediately exit, otherwise we shall continue to accumulate.

The next step is to convert the angle to radians as we cannot work with angles directly.

double dRadians = ((double)(value / (180.00 / Math.PI)));

The radians are then used to accumulate the sine and cosine:

Vx += Math.Sin(dRadians);

Vy += Math.Cos(dRadians);

And keep track of the record count:

TotalRecords++;

Putting it together:

/// <summary>

/// Accumulate the next value, not if the value is null

/// </summary>

/// <param name="value"></param>

public void Accumulate(SqlDouble value)

{

if (value.IsNull)

{

return;

}

// convert from degrees to radians

double dRadians = ((double)(value / (180.00 / Math.PI)));

// get vectors

Vx += Math.Sin(dRadians);

Vy += Math.Cos(dRadians);

//Increment total record count

TotalRecords++;

}

Terminator

The ‘Terminate’ method is called when the aggregation is complete and SQL Server requires the result.

Here we calculate the average of the vectors by dividing Vx and Vy by the total count of records processed. Then calculate the arctangent using the two vectors as input (Atan2 method of the Math library).

The output of atan is in radians and needs to be converted back to degrees.

Finally we return the result to SQL Server, with the value cast to the correct type.

/// <summary>

/// Called at the end of aggregation, to return the results of the aggregation.

/// </summary>

/// <returns></returns>

public SqlDouble Terminate()

{

double output = 0;

// Calculate average of vectors

Vx /= TotalRecords;

Vy /= TotalRecords;

Vx *= -1;

Vy *= -1;

// arc tangent

output = Math.Atan2(Vx, Vy);

// convert from radians to degrees

output = output * (180.00 / Math.PI);

// find the correct degrees

if (output > 180)

output -= 180;

else

output += 180;

return new SqlDouble(output);

}

Finish eating your serial

Now we can complete our serialization methods:

public void Read(BinaryReader r)

{

Vx = r.ReadDouble();

Vy = r.ReadDouble();

TotalRecords = r.ReadInt32();

}

public void Write(BinaryWriter w)

{

w.Write(Vx);

w.Write(Vy);

w.Write(TotalRecords);

}

Compile to dll

The solution need to be compiled to a dll:

1. Change the active configuration to ‘Release’. This will be on the toolbar for VS 2008 users
2. Select ‘Properties’ from the ‘Project’ menu
3. Go to the ‘Signing’ tab
4. Tick ‘Sign the assembly’ tickbox
5. Select ‘<New>’ from the ‘Choose a strong key file name’ drop down box
6. Enter WindDirSK as the name and enter a password if you so wish
7. Finally build the project

You now should have a dll called SQLAggExample.dll stored in the ‘Release’ directory of your project. For me this file is located at: D:\Programming\SQLAggExample\SQLAggExample\bin\Release\SQLAggExample.dll

The output window will give you the location of yours, copy the file path text as we will now need to use this in SQL Server.

Installing it on SQL Server

SQL Server needs to know about this Aggregate function.

* Swap back to Microsoft SQL Server Management Studio and select the SQLEXAMPLEDB database (or whatever database you need to use this function in)

* Select ‘File | New | Query with current connection’, or simply click on the ‘New Query’ button on the toolbar

* Enter the following code in to the query window, changing the file path appropriately, and press ‘Execute’

CREATE ASSEMBLY SQLAggExample FROM 'D:\Programming\SQLAggExample\SQLAggExample\bin\Release\SQLAggExample.dll'

GO

CREATE AGGREGATE AvgWindDir(@input float)

RETURNS float

EXTERNAL NAME SQLAggExample.AvgWindDir

Now the aggregate is loaded in to the database, yet we may not be able to use it. The database needs to be configured to enable clr methods. In a new query window enter the following and press ‘Execute’:

sp_configure 'clr enabled', 1

GO

RECONFIGURE

GO

Take a look at the Aggregate Functions folder under Database | SQLEXAMPLEDB | Programmability | Functions | Aggregate Functions. You should now see the AvgWindDir function, ready to use.

How to use?

Like any function really, SELECT dbo.AvgWindDir(WindDirField) FROM TABLE.

Let’s generate some example data, in a new query window paste the following statement:

SELECT 350 as WindDir

union

SELECT 10 as WindDir

union

SELECT 5 as WindDir

union

SELECT 355 as WindDir

union

SELECT 7 as WindDir

This will generate a set of data for us, now how do we get the average?

Select dbo.AvgWindDir(a.WindDir)

FROM

(

SELECT 350 as WindDir

union

SELECT 10 as WindDir

union

SELECT 5 as WindDir

union

SELECT 355 as WindDir

union

SELECT 7 as WindDir

) a

Summary

I hope this guide has helped you understand how to create and use a user-defined aggregate function in SQL Server and how to address wind direction averaging. There can be many different opportunities to create your own aggregate functions and as you can see, it is not difficult or complex at all.




By Mark OMeara   Popularity  (3805 Views)