Saturday, July 7, 2012

F# in SQL Server

SQL Server allows for routines written in .NET. There are plenty of examples everywhere. However, virtually all are written in C# or Visual Basic. I decided to give F# a try. I created a simple aggregate in F#, that provides a concatenation of input strings.

namespace _St

    [<SqlUserDefinedAggregateAttribute(Format.UserDefined, MaxByteSize = -1)>]
    type ConcatenationAggregate() =
       
       [<DefaultValue>] val mutable private agg : StringBuilder

       member private this.Append( s : string) =
           ignore <| this.agg.Append( if( this.agg.Length > 0) then ";" else "" ).Append( s )

       member public this.Init() = 
           this.agg <- StringBuilder()
       member public this.Accumulate(s : SqlString) =
           this.Append(s.Value)
       member public this.Merge(t2:ConcatenationAggregate) =
           this.Append(t2.agg.ToString())
       member public this.Terminate() =
           SqlString(this.agg.ToString())

       interface IBinarySerialize with
           member this.Read(r : BinaryReader) =
              this.agg <- StringBuilder( r.ReadString() )
           member this.Write(w : BinaryWriter) =
              w.Write( this.agg.ToString())

However, CREATE ASSEMBLY failed, because the referenced assembly fsharp.core was not present in the current database. So I tried importing the assembly to the database:

CREATE ASSEMBLY fsharp_core FROM 'c:\Program Files\FSharp-2.0.0.0\bin\FSharp.Core.dll'

but that failed too:

CREATE ASSEMBLY failed because type 'System.LazyFailure' in safe assembly 'FSharp.Core' has a static field 'undefined'. Attributes of static fields in safe assemblies must be marked readonly in Visual C#, ReadOnly in Visual Basic, or initonly in Visual C++ and intermediate language.

The solution was to import fsharp.core as UNSAFE assembly. Indeed, there was some more security stuff required for that. Luckily, there was a guidance on MSDN blogs.

CREATE ASYMMETRIC KEY CLR_SP_Key
FROM EXECUTABLE FILE = 'c:\Program Files\FSharp-2.0.0.0\bin\FSharp.Core.dll'
GO
CREATE LOGIN CLR_SP_Login
FROM ASYMMETRIC KEY CLR_SP_Key
GO
GRANT UNSAFE ASSEMBLY TO CLR_SP_Login
GO
USE MyTestDatabase;
GO
CREATE USER CLR_SP_Login FOR LOGIN CLR_SP_Login
GO
CREATE ASSEMBLY fsharp_core 
FROM 'c:\Program Files\FSharp-2.0.0.0\bin\FSharp.Core.dll'
WITH PERMISSION_SET = UNSAFE

Then it was rather straightforward.

CREATE ASSEMBLY ConcatenationAggregate
FROM 'C:\Test\ConcatenationAggregate.dll'
GO
CREATE AGGREGATE Test.CatAgg (@s nvarchar(max))
RETURNS nvarchar(max)
EXTERNAL NAME ConcatenationAggregate.[_St.ConcatenationAggregate]
GO
SELECT Test.CatAgg(EmailAddress) FROM Test.Person;

Tuesday, July 3, 2012

Hierarchical Data with FOR XML

I was playing with selection of hierarchical XML using FOR XML clause of Microsoft SQL Server. I was using AdventureWorks2008 database and wanted to select something like

<Root>
  <Group Name="Europe">
    <Country Code="DE">
      <Area>Germany</Area>
    </Country>
    <Country Code="FR">
      <Area>France</Area>
    </Country>
    <Country Code="GB">
      <Area>United Kingdom</Area>
    </Country>
  </Group>
  <Group Name="North America">
    <Country Code="CA">
      <Area>Canada</Area>
    </Country>
    <Country Code="US">
      <Area>Northwest</Area>
      <Area>Northeast</Area>
      <Area>Central</Area>
      <Area>Southwest</Area>
      <Area>Southeast</Area>
    </Country>
  </Group>
  <Group Name="Pacific">
    <Country Code="AU">
      <Area>Australia</Area>
    </Country>
  </Group>
</Root>
from Sales.SalesTerritory table.

FOR XML PATH

First, I tried to do that using FOR XML PATH. It was nothing particularly difficult. The only thing that is worth mentioning is that subqueries must be used to allow for grouping. (With grouping, I mean having multiple children inside a single parent node.)

SELECT [Group] [@Name],
        (SELECT [CountryRegionCode] [@Code],
                (SELECT t3.[Name] [text()]
                FROM Sales.SalesTerritory t3
                WHERE t3.[Group] = t2.[Group] 
                AND t3.[CountryRegionCode] = t2.[CountryRegionCode]
                ORDER BY [Group], [CountryRegionCode], [Name]
                FOR XML PATH('Area'), TYPE)
        FROM Sales.SalesTerritory t2
        WHERE t1.[Group] = t2.[Group]
        GROUP BY t2.[Group], t2.[CountryRegionCode]
        ORDER BY t2.[Group], t2.[CountryRegionCode]
        FOR XML PATH('Country'), TYPE)
FROM Sales.SalesTerritory t1
GROUP BY [Group]
ORDER BY [Group]
FOR XML PATH ('Group'), ROOT('Root')

FOR XML AUTO

It is not possible to mix attributes and elements when using FOR XML AUTO. However, if we do not mind having <Area Name="Germany" /> instead of <Area>Germany</Area>, it is still an option.

An interesting thing is that grouping is done automatically when selecting from joined tables. (Let me remind that FOR XML PATH requires subqueries for that to work.)

SELECT [Group].[Group] [Name]
 , [Country].[CountryRegionCode] [Code]
 , [Area].[Name]
FROM Sales.SalesTerritory [Group]
JOIN Sales.SalesTerritory [Country] 
 ON [Group].[Group] = [Country].[Group]
JOIN Sales.SalesTerritory [Area] 
 ON [Country].[Group] = [Area].[Group] 
 AND [Country].[CountryRegionCode] = [Area].[CountryRegionCode]
GROUP BY [Group].[Group], [Country].[CountryRegionCode], [Area].[Name]
ORDER BY [Group].[Group], [Code], [Area].[Name]
FOR XML AUTO, ROOT('Root')

FOR XML EXPLICIT

This is rather specific mode. The proper ordering is essential. (Note the Tag in ORDER BY clause.) That said, it works all right.

SELECT 1 Tag
 , NULL Parent
 , NULL [Root!1!!element]
 , NULL [Group!2!Name]
 , NULL [Country!3!Code]
 , NULL [Area!4!!element]
UNION ALL
SELECT 2, 1, NULL, [Group], NULL, NULL
FROM Sales.SalesTerritory
GROUP BY [Group]
UNION ALL
SELECT 3, 2, NULL, [Group], [CountryRegionCode], NULL
FROM Sales.SalesTerritory
GROUP BY [Group],[CountryRegionCode]
UNION ALL
SELECT 4, 3, NULL, [Group], [CountryRegionCode], [Name]
FROM Sales.SalesTerritory
ORDER BY [Group!2!Name], [Country!3!Code], Tag
FOR XML EXPLICIT

Conclusion

Both FOR XML PATH and FOR XML EXPLICIT can provide the required XML. The output of FOR XML AUTO would require an additional transformation, as it provides all data as attributes.

While FOR XML PATH is in general the most recommended mode, I did not like the fact I needed to use correlated subqueries. This very case is perhaps too tiny (10 rows, that is), yet the IO statistics are remarkable.

modescans reads
EXPLICIT36
AUTO3104
PATH10(!)140

That is, if you need to select XML and performance matters, it is always a good idea to try multiple FOR XML modes.

Friday, May 25, 2012

HtmlAgilityPack in F# Interactive

While using HtmlAgilityPack (version 1.4.0 Stable) in F# Interactive (of Visual Studio 2010), I was unable to access attributes of nodes. In order to debug, I converted my script into a compiled code and soon was notified that HtmlAgilityPack requires a reference to the System.Xml, Version=2.0.0.0 assembly. After switching the target framework of my project to .NET 3.5, I was able to reference the required assembly and all worked like a charm.