Neo4j & SSIS – Connecting and executing Cypher in a Control Flow

Last Friday, Anabranch released the first beta version of it’s connector to Neo4j from SSIS. Aside from a post saying that it existed, I didn’t go into detail, so this is going to be a series of posts on how you can use your existing SSIS infrastructure with Neo4j.

Today we’re going to look at 2 parts of the connector, the Neo4j Connection Manager (CM) and the Execute Cypher Task (ECT). The CM is fundamental to all the controls, without it, you can’t connect to the database. I’ll go into what it does, settings etc in another post, but for now – it’s enough to know that it provides the connection. The ECT allows us to execute Cypher against a given connection manager.

** NOTE **
In version 1.0.0(beta) – the ECT will only work with the first CM you add to the package

This video covers the same topic as the text version below:

I’m going to develop this in Visual Studio 2017, at the time of writing – I found the 2019 SSIS packages to be a bit flakey, whereas the 2017 has been sturdy so far – from a ‘demo’ point of view though – the 2019 process is exactly the same after you have it all installed.

SETUP
If you’ve never developed against SSIS before, you’ll need a couple of things, firstly SSDT (specifically the Integration services bits), Visual Studio – I think the community edition should work, but I can’t confirm. You’ll also need the Anabranch Ssis Controls for Neo4j – assuming you’ve registered ( http://anabranch.co.uk/Projects/Neo4jSsis) and have the download link, you’ll want the 2017 x86 version of the controls – (for VS2019 as well!).

Download and install the controls. NB. You want to install these when Visual Studio isn’t running – as we’re in the heady world of the GAC here, and VS won’t find them unless it’s started with them there.

Do do this example yourself – you’ll also need a Neo4j database instance running, I’d recommend using the Neo4j Desktop as it makes it easier to manage the process.

Create your first package

1. Start up Visual Studio
2. Create a new Integration Services project

New Project…

3. In the new Package.dtsx file, we need to add a Connection Manager. Right click on the bottom ‘Connection Managers’ bar and add a Neo4j connection – if you don’t see it – you might have to restart Visual Studio, or possibly your machine.

Then select the Neo4j Connection:

You’ll now see it in the ‘Connection Managers’ section:

Select it – and change the connection properties to ones that match your database instance – at the moment this is done via the properties window:

At this stage, we have a connection – but we’re not using it, so let’s add a task to execute:

Drag the ‘Execute Cypher Task’ to the Control Flow, and double click on it. Then add the following Cypher:

CREATE (:Node {Id:1})

Press OK

Then we can execute the task:

Once that’s done:

If we go to our Neo4j Database, we can run:

MATCH (n:Node) RETURN n

If we look at the ‘Id’ property – we can see it is ‘1’

So. Now we have an SSIS integration package executing against a Neo4j database.

These controls are currently in an open beta, to register to get the controls, please go to: http://anabranch.co.uk/Projects/Neo4jSsis

Neo4j & SSIS

Neo4j and SSIS are awkward bedfellows – SSIS is Microsoft and has connectors to a plethora of database and technologies using ODBC, Web etc, and Neo4j is written in Java which provides a JDBC connection. SSIS however, does not work with JDBC.

#badtimes

Some of the clients I’ve worked with like using SSIS – (some don’t), and value their 20+ years of using a piece of technology, and want to leverage it with new technologies. Nothing says expensive like having to learn a new database and a new ETL tool.

So today I’d like to introduce you to the beta (maybe alpha) version of the Neo4j Connector for SSIS. It uses bolt to securely connect to your Neo4j instance and call Cypher against it.

Version 1 beta features:

  • Neo4j Connection Manager – manages the connection to the database, and securely encrypts your password (and that is actual encryption) making it safe for you to store.
  • Execute Cypher Task – Allows you to execute a piece of Cypher on a Neo4j instance as part of a Control Flow
  • Execute Cypher Destination – Allows you to execute Cypher against a Neo4j instance as part of a Data Flow
  • Both the above pictures show the basic syntax highlighting as well
  • Works with SSIS 2016, 2017 and 2019 (CTP 3)

Do you want to try it? You need the appropriate installer, there are 6 flavours (6!!), if you’re installing on a Server – you’ll probably want the x64 version of the Server version. What? I know – not that clear. If you’re installing on a SQL Server 2016 instance, use the SQL 2016 x64 installer.

To use on a local designer (VS 2017 or 2019) you’ll want the x86 SQL 2017 version – As the integration services addin for VS 2019 still uses the 2017 install locations.

Please go to here: http://anabranch.co.uk/Projects/Neo4jSsis to get a link via email!

Give me feedback and I’ll put more posts up on how to use some of the features shortly!