Managing Data in SQL Server Projects

I recently did a whole bunch of work with trying to automate ClickOnce project builds. While I was unfortunately unsuccessful in achieving that goal, I did learn a ton about MSBuild, and had what I felt was a great idea about how those of us that use SQL Server projects in Visual Studio can manage our application’s reference data. This post shows you how.

I love SQL Server projects in Visual Studio. They provide us with a very simple method for defining our database, and versioning it’s definition over time. Unfortunately, they leave two large gaps:

They make it difficult to migrate your database over time, across revisions
They do not provide a great mechanism for managing data

It’s debatable how valuable the first item is. You can always automate your database deployment using the files produced by SQL Server projects and the SQL Package command line utility (in fact, I bundled this in a Windows Installer once, so that databases could be migrated at install time). The second item is less debatable to me, and it’s been acknowledged by the product team. The support that’s already offerred may work well for some teams, and I somewhat like the idea of using one or more Stored Procedures for deploying data. Still, I thought there was probably a different way to manage this, and it’s bugged me for almost two years.

For the uninitiated, SQL Server Projects are project templates available when you install Visual Studio. They have full MSBuild support, and enable you to deploy a database project using Visual Studio. To learn more, you can have a look at this Microsoft documentation page.

Of course, a lot of teams can’t deploy using Visual Studio: they don’t have the necessary permissions in the production environment; they want the process to be more repeatable, or more automated; or a host whole of other reasons. My team is one such team. We discussed and evaluated several options:

Purchasing a separate solution, such as the tools from Redgate.
Managing individual SQL scripts for managing the database
Using SQLPackage to automate deployment

The first option was less than desirable for my team. From what I understand, the Redgate tools are fantastic, but we wanted to be sure we had the discpline to follow a process prior to making an investment in additional technology. The second option, managing individual scripts, would have been a great option, if we didn’t have to manage a bunch of projects that were already deployed. For new projects, this was obviously the most desirable option (without additional investment). Teams have managed their projects this way for a very, very, long time.

The last option is one that I was interested in for some of my team’s projects. The glaringly obvious advantage is that you can auto-roll the database between versions, and can potentially manage migrations among various developers. Like many enterprise teams, my team was spread across multiple projects, so if we could get this working, it would make a lot of sense. The team would have been able to bring any managed database from it’s previous version to the latest version (in source control). Who wouldn’t want that? And it could also manage the databases already in production?

There was one “gotcha”: some of our databases have heaps of reference data, and we had to work out how to manage it. SQL Server Projects do not provide a great way for managing this. You can tell Visual Studio that you want a Pre-deployment script, a Post-deployment script, or both, but you can only define one of those. You cannot define multiples of either one, which can lead to very large scripts. I personally don’t like dealing with large SQL Scripts, and found this to be fairly unattractive.

So, Brian, where’re you going with this? I thought you said you had a solution?

In the last few weeks, I had an “Aha!” moment: there was probably a way to build these scripts at compile-time for the entire project. And it turns out, this is fairly trivial to do.

To start, let’s define a simple SQL Project file:

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
  <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" />
  <PropertyGroup>
    <Configuration>Release</Configuration>
    <Platform>AnyCPU</Platform>
    <Name>ExampleDB</Name>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectVersion>4.1</ProjectVersion>
    <ProjectGuid>{747f3381-0743-4aa0-82a5-79ae0e623cfd}</ProjectGuid>
    <DSP>Microsoft.Data.Tools.Schema.Sql.Sql110DatabaseSchemaProvider</DSP>
    <OutputType>Database</OutputType>
    <RootPath>
    </RootPath>
    <RootNamespace>ExampleDB</RootNamespace>
    <AssemblyName>ExampleDB</AssemblyName>
    <ModelCollation>1033,CI</ModelCollation>
    <DefaultFileStructure>BySchemaType</DefaultFileStructure>
    <DeployToDatabase>True</DeployToDatabase>
    <TargetFrameworkVersion>v4.5.2.</TargetFrameworkVersion>
    <TargetLanguage>CS</TargetLanguage>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <TargetDatabaseSet>True</TargetDatabaseSet>
    <DefaultCollation>SQL_Latin1_General_CP1_CI_AS</DefaultCollation>
    <CompatibilityMode>110</CompatibilityMode>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <OutputPath>bin\Release\</OutputPath>
    <BuildScriptName>$(MSBuildProjectName).sql</BuildScriptName>
    <TreatWarningsAsErrors>False</TreatWarningsAsErrors>
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <DefineDebug>false</DefineDebug>
    <DefineTrace>true</DefineTrace>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup>
    <VisualStudioVersion Condition="'$(VisualStudioVersion)' == ''">11.0</VisualStudioVersion>
    <!-- Default to the v11.0 targets path if the targets file for the current VS version is not found -->
    <SSDTExists Condition="Exists('$(MSBuildExtensionsPath)\Microsoft\VisualStudio\v$(VisualStudioVersion)\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets')">True</SSDTExists>
    <VisualStudioVersion Condition="'$(SSDTExists)' == ''">11.0</VisualStudioVersion>
  </PropertyGroup>
  <Import Condition="'$(SQLDBExtensionsRefPath)' != ''" Project="$(SQLDBExtensionsRefPath)\Microsoft.Data.Tools.Schema.SqlTasks.targets" />
  <Import Condition="'$(SQLDBExtensionsRefPath)' == ''" Project="$(MSBuildExtensionsPath)\Microsoft\VisualStudio\v$(VisualStudioVersion)\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" />

  <PropertyGroup>
    <PostDeploymentScript>Deployment\Post-Deployment.sql</PostDeploymentScript>
  </PropertyGroup>
  
  <ItemGroup>
    <Folder Include="Data\" />
    <Folder Include="Deployment\" />
    <Folder Include="Properties" />
    <Folder Include="Tables\" />
  </ItemGroup>
  <ItemGroup>
    <Build Include="Tables\EXAMPLE.sql" />
    <PostDeploy Include="$(PostDeploymentScript)" />
  </ItemGroup>
  <ItemGroup Label="DeploymentScripts">
    <None Include="Data\0001_EXAMPLE.sql" />
  </ItemGroup>

  <!-- Here's where we'll put our consolidation script. -->
</Project>

For brevity, I’ve excluded a fair bit of the content that would normally be included in this file. I’ve also removed things like the Debug configurations, because they’re not important for this example. What is important is the definition of the EXAMPLE table:

CREATE TABLE [dbo].[EXAMPLE] (
  [EXAMPLE_ID]    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
  [CREATED_DATE]  DATETIMEOFFSET NOT NULL DEFAULT GETUTCDATE(),
  [EXAMPLE_NAME]  VARCHAR(20) NOT NULL
);

Now that we have our simple example, let’s show the definition of the 001_EXAMPLE.sql script, shown in the DeploymentScripts ItemGroup:

PRINT N'Populating the [dbo].[EXAMPLES] table...';
GO

MERGE [dbo].[EXAMPLE] AS TargetExamples
USING (VALUES
    (N'Example1'),
    (N'Example2'),
    (N'Example3')
) AS SourceExamples ([SOURCE_NAME])
ON (TargetExamples.EXAMPLE_NAME = SourceExamples.SOURCE_NAME)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (EXAMPLE_NAME)
    VALUES (SOURCE_NAME)
OUTPUT $action, inserted.*;

GO

If you’re not familiar with it, the MERGE keyword makes it possible to compare the data between a user-defined inline table and a table in SQL Server. Next, we’ll add an MSBuild target to manage script generation. The first thing we’ll do is add a PowerShell script, Concatenate.ps1:

param (
    # Specifies the list of files to consolidate as a ';'-delimited list.
    [Parameter(Mandatory = $true)]
    [string]
    $Inputs,

    # Specifies the location to write to.
    [Parameter(Mandatory = $true)]
    [string]
    $OutFile
)

# Resolves a list of strong file references from the file system.
$scripts = $Inputs.Split(';') | Get-Item

# Prints script names to the console, which become visible to MSBuild.
$scripts | ForEach-Object { Write-Host "`tRx file '$($_.FullName)'" }

# Overwrites the existing file, if it exists.
New-Item -ItemType File -Path $OutFile -Force | Out-Null

# Concatenate all the files, and write them to the target location.
$scripts | ForEach-Object { $_ | Get-Content | Out-File $OutFile -Append }

With that, we can add our MSBuild extensions:

<Project>
  <!-- Here's where we'll put our consolidation script. -->
  <Target Name="CreatePreAndPostDeployScripts" BeforeTargets="CoreCompile;">
    <ItemGroup>
      <PostDeployScripts Include="Data\**\*.sql" />
    </ItemGroup>

    <PropertyGroup>
      <ConcatenateScript>$(MSBuildThisFileDirectory)Concatenate.ps1</ConcatenateScript>
    </PropertyGroup>

    <Exec Command="powershell.exe -NoProfile -ExecutionPolicy Bypass &quote;$(ConcatenateScript)&quote; &quote@(PostDeployScripts)&quot;" />
  </Target>
</Project>

By including these attributes in your project, along with running this custom MSBuild task before the CoreCompile task, you can easily generate the data file (or files) that are later consumed in the build toolchain. Best of all, by using the T-SQL MERGE syntax, you can easily guarantee that your deploy scripts are safe. If your process is such that developers are not engaged in production deployments, this becomes an increasingly useful property. I personally try to test as much of the deployment tool chain as possible as often as possible, because it can be difficult to get right without proper infrastructure support (such as: creating clean environments for testing build and deployment).

That pretty much covers this topic. I hope it helps you and your team.

Thanks for reading!

- Brian