Thursday 1 November 2007

Custom SSIS Control Flow Task

Okay, so if you're reading this chances are you're looking to create a custom SSIS task using C# to do your bidding. Either that or you've just stumbled across this blog on your tea break whilst surfing the web. Either way, you have my sympathies. ;-)

So, why this entry? Well, one morning I woke up and suddenly thought "how cool to write a custom SSIS task that will process Analysis Services objects as defined by information in a relational database." Now I know I can use a "for-loop" task along with an Analysis Services DDL task but (a) where's the challenge in that (b) I want access to the lower level calls and information that SSAS returns and (c) how does that help me understand the SSIS internals I ask you?

Now as it happens you may, like I did, have the dismal mis-conception that it should be fairly easy to create a "nice" custom task, what with all the information in BOL and on the internet. Alright, maybe that's a bit harsh. To create the task itself isn't really that hard I suppose, but it's the stuff around it that I find really annoying and time consuming. Things that really should be easy, aren't. Stuff that you have to do before you even get to build the task is just badly documented or scattered around, not in one place. In some cases there's even confusion amongst developers and posts are clearly incorrect. Only, it's only clearly incorrect after you know the answer.

So, in a nutshell, here are my thoughts and findings. If you are a seasoned C# developer then maybe these things are trivial, in which case this blog isn't for you. However, for the rest of us:

  1. The SSIS model, as compared to DTS, is actually okay. Microsoft appear to have thought about this somewhat and came up the logic of allowing the separation of the UI from the task, such that at runtime the amount of code to be loaded is reduced etc. Good idea! They even specify that that is the best practice. <rant><sarcasm>Well, thank goodness for Microsoft's life saving documentation then!</sarcasm>So why is it, that if according to Microsoft's own recommendations for splitting the UI from the task, does their own "definitive guide", and by that I mean the sample to which we are continually referred, for creating custom tasks have the UI and the task in one assembly? Just how is that helpful?</rant>

    This reference to the Increment Task includes the following statement:
    "A separate assembly for the user interface is recommended for ease of deployment because it allows you to update and redeploy the connection manager or its user interface independently."
    And this Microsoft link states:
    "A task and its user interface are separate entities. They should be implemented in separate assemblies to reduce localization, deployment, and maintenance work."
    Further "Increment Task" sample links:
    Creating a Custom Increment Task
    or
    Creating a Custom Increment Task
  2. So you would like to use your own icon for your task eh? Pity lots of documentation out their is either grossly inaccurate or totally out of date.
  3. Resizing list boxes smoothly by keeping it in sync. with a PropertyGrid when the form is resized, like the Analysis Services DDL Task, isn't a particularly intuitive thing to achieve.
  4. Then there's the PropertyGrid. It seems to me that with the ".net" languages Microsoft discovered the concept of reflection and like XML, it must be used everywhere! Now don't get me wrong, I'm not saying that they didn't know about it before. I mean Delphi (not MS I know) had RTTI etc. so it's a general concept but it just seems that everything uses reflection, including the PropertyGrid.
  5. Anyone found any decent documentation on using ConnectionManagers? So we have a ConnectionManager that can define a connection of varying types which is cool. Very handy. The connections can be OLE DB, ADO, ADO.NET, MSOLAP.2 (or 3) etc. Yet most of the documentation seems to imply that you can just call AcquireConnection, cast the object returned to an object of type SqlServer and "Bob's your uncle". Yeeeeesssss, if only.

Those are the ones that have caused me the most grief so far. I'll add more as and when I find them, but in the meantime, to save you the pain, here are my hints and tips to prevent this from happening to you, starting with the concept of keeping the task and UI separate.


Separating the UI from the task
I reckon that this is a solid approach and one which should be followed as a matter of course. Once you've worked out how it all works it actually wouldn't be that hard to knock up some boilerplate code or a project set at a starting point for all future tasks. The idea is as follows:

  • Identify all the properties about the task that you need to capture from the user. Things like connection types and connection managers, starting values and filter conditions etc. Once you know what they are, you're pretty much set.

  • In the task (i.e. the class that extends Task and contains the methods InitializeTask, Validate and Execute), create private, class level variables of the appropriate type for each of the properties identified above. Naturally include the DtsTaskAttribute required to link the task to the UI as described in all documentation and the samples etc.

  • DtsTask DtsTaskAttribute
  • In the same class, create properties for the task (i.e. "methods" using get and set) allowing each of the required variables identified above to be set and retrieved.

  • Property
  • Create a new assembly for the UI. Add a class that extends Form, and include the class identified by the DTSAttribute annotating the Task that is to be called to instantiate the UI. The constructor for the form should, at the very least, require the TaskHost to be supplied on instantiation in order to allow the retrieval and setting of the Task's properties. Note that there is nothing really special here. Just follow the standard documentation regarding building a UI for your task but do so in a separate assembly.

  • UI Class
    C# SSIS custom task UI
    UI Class Methods (Initialize method repeated here)
    Note the "GetView" method that passes the TaskHost object into the form's constructor.
    C# SSIS custom task methods
  • In the constructor for the form, record the TaskHost variable passed into the constructor in a class level global variable of type TaskHost for ease of access.

  • SSIS task UI form
  • Now to save any changes made in the UI back to the Task's property methods, use code similar to the following, where "g_TaskHost" is the global variable for the TaskHost object, "SQLConnectionType" is the property created on the Task and "ADO.NET" is the value to which it should be set.
    g_TaskHost.Properties["SQLConnectionType"].SetValue
    (g_TaskHost, "ADO.NET")
    ;

  • To load the values from the Task for display in the UI, use code similar to the following, using the same conventions as described in the point immediately above.
    g_TaskHost.Properties["SQLConnectionType"].GetValue
    (g_TaskHost).ToString();


Custom Icon
I have seen a lot of documentation on the web regarding the value for "IconResource" as it appears in the DtsTaskAttribute and what it should be. They state things like the name of the icon must be "MyTaskIcon.ico" and/or the resource must be fully qualified with the namespace etc. This may or may not have been correct in beta versions etc. but as it stands today you can call the icon whatever you want and the qualified name must be the name of the assembly in which the icon resides "dot" the name of the file.

See: DtsTaskAttribute.IconResource

So, if your assembly is called: "foo.bar.dll" using a namespace of "me.baz" and the name of your icon is "Designer.ico", then the correct value for "IconResource" is: foo.bar.Designer.ico


Resize List Boxes in sync. with a PropertyGrid
Set the ListBox property "IntegralHeight" to "False". That's it!


Using PropertyGrids
Hmmmm, this is a section all on its own. If anyone ever actually reads this blog and would like some information on this I'll post more but beware of what you ask for...


ConnectionManagers
I've yet to follow this through myself. Currently I'm coding my task specifically around the ADO.NET connection type which allows me to cast the connection to "SqlServer". I will support more connection types later.

My feelings thus far are that to accommodate different connection types in a way that minimizes code rework, one could either create a class that encapsulates the nuances of the different connection types, providing common functionality, or use method overloading in some way.



Well that's it. If you got this far either pat yourself on the back for your perseverance or shake your head in disgust at how geeky you really are. :-)

'Till next time...
Phil