Reactive Neo4j using .NET

Version 4.0 of Neo4j is being actively worked on, and aside from the new things in the database itself, the drivers get an update as well – and one of the big updates is the addition of a Reactive way to develop against the DB.

Now – I’ve not done reactive programming for a long time. I think I did play around with it when .NET 4 was first released, but I have no idea where that blog post has gone – so I may as well start anew.

I found it! Not the post, but the application – MousePath – which is now on GitHub: MousePath – aside from it ‘working’ it’s not performant in any way.

What is Rx/Reactive?

Reactive in .NET is all about the IObservable<T>/IObserver<T> interfaces. They’ve been around since .NET 4, but personally I’ve never really used them. They allow application code to react to data being pushed to it, rather than the more traditional way of requesting the data.
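To make the push vs pull difference a bit more concrete, here’s a minimal sketch. It’s in Python rather than .NET purely so it runs on its own, and the Subject class with its subscribe/push methods is made up for illustration – it is not the Rx or Neo4j API.

```python
from typing import Callable, List


class Subject:
    """Hypothetical minimal push source (loosely like IObservable<T>)."""

    def __init__(self) -> None:
        self._observers: List[Callable[[str], None]] = []

    def subscribe(self, on_next: Callable[[str], None]) -> None:
        # Register a callback; the source decides when it gets called.
        self._observers.append(on_next)

    def push(self, value: str) -> None:
        # Push a value to every registered observer.
        for on_next in self._observers:
            on_next(value)


titles = ["The Matrix", "Cloud Atlas"]

# Pull: the consumer asks for each item itself.
pulled = [title for title in titles]

# Push: the consumer registers interest, then reacts as items arrive.
received: List[str] = []
source = Subject()
source.subscribe(received.append)
for title in titles:
    source.push(title)

print(pulled == received)
```

Both approaches end up with the same data; the difference is who is in control of when each item is handled.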

There’s a good book (Intro to Rx), freely available online, which I will be using to work this out.

Starting off

For this project, we’re going to need the NuGet package – which in this case isn’t Neo4j.Driver – but Neo4j.Driver.Reactive. When we add this to our project – and create a driver in the normal way – we can see we now have an ‘RxSession’ which is an extension method on IDriver.

So let’s create a reactive session and see what we can see.


We get IObservable as opposed to the AsyncSession giving us Tasks


Doing a Run-ner

So, back to our RxSession, let’s do a basic version, just using Run

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
rxStatementResult
    .Records()
    .Subscribe(
        record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>()));

In here, we’re hooking up to the ol’ classic Movies database, and simply writing the titles to the screen. NB – Driver is a static property of type IDriver I have defined elsewhere.

The first two lines look pretty much like our normal code – the only real difference being the use of the ‘RxSession‘ as opposed to just ‘Session‘.

Run on an RxSession returns an IRxStatementResult – which has 3 methods we’re interested in, (well actually only 1 at the moment) – Records(), Consume() and Keys().

Records() gets us the records from the database, so the stuff we want to do things with, Consume() whips through those records so we can get an IResultSummary telling us what is going on, and Keys() gets us the keys that are returned, in the simple statement I’ve done – ‘m.title‘ is the only key.

Records() is what we’re using, as we want to deal with the data. Records() returns an IObservable<IRecord>, and being an IObservable – we need to Subscribe() to it to get the data. Subscribing means we will provide an IObserver that will be notified whenever an IRecord arrives.

In this case, we have the contents being written to the console. Aces.


Being a console app – doing tiny amounts of work – I largely don’t need to worry about disposing of my resources, but let’s imagine resource usage is something we do care about. How do you go about disposing of your resources?

IDisposable? INosable! – the IRxSession doesn’t implement IDisposable, instead we have to Close<T>() it – and this is where things have got a little fuzzy for me – I’m not entirely sure I’m closing it correctly.

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
rxStatementResult
    .Records()
    .Subscribe(
        record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>()));


Now, I expect to get either no results, or a smaller subset (depending on the speed of the code running) – what I get is the close being called, but still getting the full amount of data – I suspect I misunderstand what is going on here.

Let’s say I do want a smaller subset – or to quit – how do I do it? Well, the Subscribe() method actually returns an IDisposable – so we ‘unsubscribe’ by disposing of our subscription:

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
var subscription = rxStatementResult
    .Records()
    .Subscribe(record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>()));

await Task.Delay(220);
// Unsubscribe: no further records are delivered to the lambda above.
subscription.Dispose();

The ‘delay’ magic number there is enough time to get some records, but not all; anything less than that gave no results, anything more – all the results. ¯\_(ツ)_/¯
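The timing will differ from machine to machine, so as a rough illustration of what disposing a subscription does, here’s a small Python sketch with no database involved. RecordSource and Subscription are hypothetical names standing in for the observable and the IDisposable – it’s a toy model of the idea, not how the driver is implemented.

```python
from typing import Callable, List


class Subscription:
    """Hypothetical stand-in for the IDisposable returned by Subscribe()."""

    def __init__(self, observers: List[Callable[[str], None]],
                 on_next: Callable[[str], None]) -> None:
        self._observers = observers
        self._on_next = on_next

    def dispose(self) -> None:
        # Unsubscribe: remove the callback so later values are not delivered.
        if self._on_next in self._observers:
            self._observers.remove(self._on_next)


class RecordSource:
    """Hypothetical push source standing in for an IObservable of records."""

    def __init__(self) -> None:
        self._observers: List[Callable[[str], None]] = []

    def subscribe(self, on_next: Callable[[str], None]) -> Subscription:
        self._observers.append(on_next)
        return Subscription(self._observers, on_next)

    def push(self, value: str) -> None:
        # Copy the list so observers can be removed while we iterate.
        for on_next in list(self._observers):
            on_next(value)


got: List[str] = []
source = RecordSource()
subscription = source.subscribe(got.append)

source.push("The Matrix")    # delivered
subscription.dispose()       # unsubscribe
source.push("Cloud Atlas")   # not delivered, nobody is listening any more

print(got)
```

Only the value pushed before dispose() ends up in the list, which is the same shape as getting ‘some records, but not all’ above.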

Ooook So – Why Rx?

It seems more complex right? Subscribe(), unsubscribe – no foreach in sight! What’s the point?

My understanding – and this could be/probably is wrong – is that by using Rx – we’re reducing our overheads – i.e. instead of streaming everything, we can just stream what we’re consuming at the time.

The other key benefits come from things like ‘.Buffer‘ and the other commands (Skip, Last, etc) allowing you to stream things in a better way.
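As a rough picture of what a count-based buffer does, here’s a Python sketch that groups a stream into batches. The buffer function here is my own hypothetical helper, loosely modelled on my understanding of Rx’s Buffer(count); it works on a plain iterable rather than an observable, so treat it as an analogy only.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def buffer(items: Iterable[T], count: int) -> Iterator[List[T]]:
    """Group items into lists of `count`; the last list may be shorter."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == count:
            yield batch
            batch = []
    if batch:
        # Flush whatever is left once the input is exhausted.
        yield batch


titles = ["A", "B", "C", "D", "E"]
batches = list(buffer(titles, 2))
print(batches)
```

The consumer then deals with a handful of records at a time instead of one by one, or all at once.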

One nice thing about Rx in .NET is that it’s not the same as async – you don’t have to have your entire stack in Rx to get the benefits – you can do bits and pieces where you need to – if you’ve got a lot of data maybe it makes sense for a given query.

Execute Cypher Task Updates

Last week, Anabranch released version 1.1 of the tools for Neo4j – which included a very welcome addition to the toolset – being able to pull data from a Neo4j instance.

After doing the demo post – I noticed a peculiarity – a quirk if you will with how the ‘Execute Cypher Task’ (see here) worked with multiple Neo4j Connection Managers defined – it would execute the Cypher against all the Connection Managers, not any specific one. This makes sense in some form – as a Control Flow Task doesn’t have a ‘Connection’ selector unlike a Data Flow Task.

Version 1.2 of the tools fixes this and tidies up the Execute Cypher Task to have a better user interface as well.

Adding Cypher

The Cypher box still has highlighting, but now lines up to the edges properly.

Choosing A Connection

You can choose from a drop down which connection you want to use. You will only see Neo4j Connection Managers here. The name of the Connection Manager will be the one you set it to.


You will get a red cross on your task if you’re missing things (in this case the connection). If you look in the ‘Error List’ you will be able to see all the errors:

Getting 1.2

To get version 1.2, please register and you’ll be sent the download link. Registration is only used to let you know of updates to the tools, no marketing!

Using a Data Flow to move data **from** Neo4j in SSIS

That’s right everyone! We’re going from Neo4j this time, and this is a new release: the old version didn’t have a ‘Neo4j as a Source’ component, the new one does.

In the last post we took data from a file and ingested it into Neo4j, so far so good – but one of the things we were missing was the ability to also pull from Neo4j, now the circle is complete, and in this post – I’m going to show you how to pull from one Neo4j instance into another. That’s right – Neo4j to Neo4j!

As always – the video below shows the moving version of this post – but not everyone wants that.

The Setup

OK, more complex than normal, as we need multiple instances of Neo4j running – and whilst that’s not rocket science – it is more complex than normal. I don’t want to go into it particularly – but hey! I would run one DB in Neo4j Desktop, then download one of the server editions (for this Community will be just fine!)


You need to change the ports on your new server, as the Desktop ones will be using 7474/7687 etc – So open up the neo4j.conf file and change the following settings:

These are the ports I’m using – but go crazy and pick whatever you want – it’s your database after all. Aaaanyhews – I’m going to assume you know how to start your server version of the database. If not – there’s loads of stuff online about how to do it – and if it becomes clear that we’re in a world of pain here – I’ll write one 🙂

Clear the DBs

WARNING!!! – which I don’t think we need – but here you go – make sure you know which DB you are doing this on! Don’t delete your production DB by mistake!!! (•_•)

On both the DBs we’re going to clear them, and add the ‘Movies’ demo set to one of them so – open up your browser window to both instances (http://localhost:7474 and http://localhost:7676 in my case) and execute:


Then in one of the databases (and I will be using my 7474 database) execute:

:play movies

And step to the second step and put the movie data into your database. You can check the data is all there by running:


You should get 171 nodes. OK, now we’re all set up and ready to go!

Let’s SSIS

As another assumption – I’m going with the fact that you know how to start up Visual Studio and create a new Package.

Let’s first add one connection:

And, obvs pick Neo4j:

Oooh – note the ‘version’ there as well – if yours says lower than that – then bad times 🙁

Rename it to something like ‘The Source’ or whatever you find memorable:

Make sure the user / pass and server are all correct:

Looking good! Now – repeat – for the other server – remembering the port will be different – and choosing a different name, something like ‘The Destination’ for example, and you should end up with this state of affairs:

Let’s add a ‘Data Flow’ to our package now, again you can rename if you want. I did, but don’t let that force you into doing anything:

Double click on it, and we’re into Data Flow design heaven!

Add the Source

Drag the ‘Execute Cypher Source’ component from the toolbox onto the page:

Double click on it to enter the ‘Edit’ page:

The Cypher we’re going to execute is:

MATCH (m:Movie) RETURN m.title AS title

Now – some TOP TIPS. This works best if you RETURN specific columns, SSIS doesn’t know what to do with a full node, and using the AS there makes the output columns easier to use.

Once you’ve got the Cypher – you need to select the Connection to use (see the picture) – which is why naming them nicely is SUPER useful.

Once you’ve done that, hit ‘Refresh’ to get the Output Columns populated:

Job done. Good work!

Add the Destination

No surprises for guessing this involves dragging the Destination to the page.

Next, join up the Source to the Destination:

The UI for this is not as fully fleshed out as the other, so unfortunately we need to head into the Advanced Editor. So right click on it, and open the Advanced Editor:

First we want to set the connection:

Again – naming!!

Then we’re going to go to the ‘Input’ tab and select our input from the Source:

Press OK to save all that, and then double click on the Destination item and go to the Cypher Editor:

First off – you can see the ‘title’ listed in the parameters, so that’s good – Cypher wise we’re doing a MERGE – so we only get one ‘Cloud Atlas’ (because no-one needs more than one of those).

MERGE (:Movie {title: $title})

At this point, we have our two things and no red crosses or errors anywhere, so let’s run it!

Run it!

No surprises – we press ‘Start’ and get the ‘liney’ version of the page which hopefully you see as:

38 rows (hahaha Rows!) and if you go to your ‘Destination’ database you should see the movies there.

I want it

Of course you do – these controls are currently in an open beta, to register to get the controls, please go to:

Using a Data Flow to move data from who knows where to Neo4j in SSIS

In what is rapidly becoming a series of posts – we look into another of the components in the Anabranch SSIS Components for Neo4j package. The last post looked at using the “Execute Cypher Task” from within a Control Flow, but that’s not so useful, I mean – it’s great for doing things like Deleting a DB, adding indexes etc, but when we want to get Data from one source to another, we gotta go all Data Flowy.

I’m working on the principle that you’ve gone through the last post, as I’m going to pick up from where we left off, and I make no apologies for my assumptions.

Clear the DB

I should mention – please check which DB instance you are connected to – nothing says ‘problem’ quite like deleting your production database.

Let’s first clear the Neo4j instance back to an empty state, run:


In the browser.

Clear Package

We don’t want the Execute Cypher Task any more, so select it – and press Delete, or go all Mousey and right-click – the choice is yours

Deleting the mouse way

Let’s Data Flow (Task)!

Drag a Data Flow Task onto the Control Flow workspace:

Double click on the Task to be taken to the Data Flow workspace, which will be empty. So let’s drag a ‘Flat File Source’ to the space:

Double click on the Flat File Source, and the editor will pop up. We need to add a new Connection Manager, so press ‘New…’

Now, we want to use a CSV file, you can use the one I use by downloading from this link, it’s not very exciting I’m afraid, just some names 🙂 Anyhews – fill in the details that match your file (the ones in this picture match my file, the only thing I’ve changed from default is the Code page to be 65001 (UTF-8))

Then click on the ‘Columns’ bit on the left hand side, to make sure it all looks ok, and press OK. You’ll be back to the ‘Flat File Source Editor’ – and you should now click on the ‘Columns’ bit here too:

Make sure at least the First/Last names are checked here – obviously if you’re using your own file – pick your columns! Press OK and go back to the workspace.

Now drag an ‘Execute Cypher Destination’ task to the workspace:

Drag the ‘Blue arrow’ from the Flat File Source, and attach it to the Execute Cypher task:

Then, right click on the execute cypher task, and select ‘Show Advanced Editor…’

First, set the connection manager, we want to use our existing Neo4j Connection Manager

Then we want to select the ‘Input Columns’, just pick them all for now:

Press OK, and then Double click on the Execute Cypher Task, to get the Cypher Editor

Add the Cypher as I have above:

CREATE (:User {First: $FirstName, Last: $LastName})

And press OK.

Do some SSISing!

Now, all that’s left to do is press Start (or Right-click – Execute Task) whichever is your preference!

It’ll run, and give you the following:

Which you can check in your DB by running:


Things are a bit more interesting now, as we’re pulling from a different source and putting into the database, obviously SSIS supports loads of sources – with


Neo4j & SSIS – Connecting and executing Cypher in a Control Flow

Last Friday, Anabranch released the first beta version of its connector to Neo4j from SSIS. Aside from a post saying that it existed, I didn’t go into detail, so this is going to be a series of posts on how you can use your existing SSIS infrastructure with Neo4j.

Today we’re going to look at 2 parts of the connector, the Neo4j Connection Manager (CM) and the Execute Cypher Task (ECT). The CM is fundamental to all the controls, without it, you can’t connect to the database. I’ll go into what it does, settings etc in another post, but for now – it’s enough to know that it provides the connection. The ECT allows us to execute Cypher against a given connection manager.

** NOTE **
In version 1.0.0(beta) – the ECT will only work with the first CM you add to the package

This video covers the same topic as the text version below:

I’m going to develop this in Visual Studio 2017, at the time of writing – I found the 2019 SSIS packages to be a bit flakey, whereas the 2017 has been sturdy so far – from a ‘demo’ point of view though – the 2019 process is exactly the same after you have it all installed.

If you’ve never developed against SSIS before, you’ll need a couple of things: firstly SSDT (specifically the Integration Services bits), and Visual Studio – I think the community edition should work, but I can’t confirm. You’ll also need the Anabranch SSIS Controls for Neo4j – assuming you’ve registered and have the download link, you’ll want the 2017 x86 version of the controls (for VS2019 as well!).

Download and install the controls. NB. You want to install these when Visual Studio isn’t running – as we’re in the heady world of the GAC here, and VS won’t find them unless it’s started with them there.

To do this example yourself – you’ll also need a Neo4j database instance running, I’d recommend using the Neo4j Desktop as it makes it easier to manage the process.

Create your first package

1. Start up Visual Studio
2. Create a new Integration Services project

New Project…

3. In the new Package.dtsx file, we need to add a Connection Manager. Right click on the bottom ‘Connection Managers’ bar and add a Neo4j connection – if you don’t see it – you might have to restart Visual Studio, or possibly your machine.

Then select the Neo4j Connection:

You’ll now see it in the ‘Connection Managers’ section:

Select it – and change the connection properties to ones that match your database instance – at the moment this is done via the properties window:

At this stage, we have a connection – but we’re not using it, so let’s add a task to execute:

Drag the ‘Execute Cypher Task’ to the Control Flow, and double click on it. Then add the following Cypher:

CREATE (:Node {Id:1})

Press OK

Then we can execute the task:

Once that’s done:

If we go to our Neo4j Database, we can run:


If we look at the ‘Id’ property – we can see it is ‘1’

So. Now we have an SSIS integration package executing against a Neo4j database.


Neo4j & SSIS

Neo4j and SSIS are awkward bedfellows – SSIS is Microsoft and has connectors to a plethora of database and technologies using ODBC, Web etc, and Neo4j is written in Java which provides a JDBC connection. SSIS however, does not work with JDBC.


Some of the clients I’ve worked with like using SSIS – (some don’t), and value their 20+ years of using a piece of technology, and want to leverage it with new technologies. Nothing says expensive like having to learn a new database and a new ETL tool.

So today I’d like to introduce you to the beta (maybe alpha) version of the Neo4j Connector for SSIS. It uses bolt to securely connect to your Neo4j instance and call Cypher against it.

Version 1 beta features:

  • Neo4j Connection Manager – manages the connection to the database, and securely encrypts your password (and that is actual encryption) making it safe for you to store.
  • Execute Cypher Task – Allows you to execute a piece of Cypher on a Neo4j instance as part of a Control Flow
  • Execute Cypher Destination – Allows you to execute Cypher against a Neo4j instance as part of a Data Flow
  • Both the above pictures show the basic syntax highlighting as well
  • Works with SSIS 2016, 2017 and 2019 (CTP 3)

Do you want to try it? You need the appropriate installer, there are 6 flavours (6!!), if you’re installing on a Server – you’ll probably want the x64 version of the Server version. What? I know – not that clear. If you’re installing on a SQL Server 2016 instance, use the SQL 2016 x64 installer.

To use on a local designer (VS 2017 or 2019) you’ll want the x86 SQL 2017 version – As the integration services addin for VS 2019 still uses the 2017 install locations.

Please go to here: to get a link via email!

Give me feedback and I’ll put more posts up on how to use some of the features shortly!

Actually using the new DataConnector for PowerBI

After I’d written it – I realised my last post was perhaps not the most useful for those who really don’t care about the how but want to know what to do to use it. So this will follow the same deal as with the last post (over a year and a half ago!! WOW!).

Video version below if you want it:

The Setup Steps

First – we’ve got to install PowerBI – now, I didn’t sign up for an account, but downloaded it from the PowerBI website, and installing was simple and quick.

We also need to have Neo4j running, and you can use Community or Enterprise, it matters not – and we’ll want to put the ‘Movies’ dataset in there, so run your instance, and execute:

:play movies

Add the Data Connector to Power BI

1. First – download the connector from the releases page (or build it yourself in VS) – you want the `Neo4j.mez` file.

Version 1:

2. PowerBI looks for custom connectors in the <USER>\Documents\Power BI Desktop\Custom Connectors folder, which if it doesn’t exist – you’ll need to create. Once you have that folder, copy the Neo4j.mez connector there.


3. Nearly there – we just need to allow PowerBI to load the connector now – so, start up PowerBI and go to the Options dialog:


Once there, select the ‘Security’ option, and then under the Data Extensions header select the option allowing you to load any extension without validation or warning:


You’ll have to restart PowerBI to get the connector to be picked up – so go ahead and do that now!

Let’s Get Some Data!

Now – I know a lot of you will have been excited by the Pie Chart from the last post – now you can create your own!

With a new instance of PowerBI running, let’s select ‘Get Data’


We can now either search for ‘Neo4j’ or look in the ‘Database’ types:

Select Neo4j then press ‘Connect’ – aaaaand a warning!

Read it, ignore it – it’s up to you – but this is just to let you know that it’s still in Beta (I mean it’s only had one release so far!) Continue if you’re happy to.

Now you’re given the boxes to enter your Cypher and connection information – the text box for the Cypher field is a single line (ugh) – so if it’s a complicated query – you’re probably best off writing it in Sublime or similar (maybe even Notepad!?!). In this case, we can go simple:

MATCH (m:Movie) RETURN m;

Now, the other settings, if you’re running default settings – you can leave these, but obviously if you need to connect to https instead of http change it in the scheme setting. I’ve filled in my display with the defaults:

When you press OK, you get the Login dialog. If you are connecting anonymously, then select Anonymous, else – fill in your username / password.

Press ‘Connect’, and PowerBI will connect to your DB and return you back a list of ‘Record’:

We’ll want to ‘Edit’ this, so press ‘Edit’!

When the Power Query Editor opens press the expand column button at the top of the ‘m’ column:

Ooooooh, ‘tagline’, ‘title’ and ‘released’ — our movie properties! For this, I would turn off the ‘use original column name as prefix’ checkbox, leave them all selected and press OK.


Now, let’s ‘Close & Apply’ our query – NB – if you look in the ‘Applied Steps’ section, you can see we only have 2 steps, ‘Source’ and ‘Expanded m’

Whilst PowerBI applies it just think of the Pie charts that lie ahead of us:

When that dialog disappears, we’re good to go! On the right you’ll see a ‘Fields’ section, and you should see something like this:

So, the moment we’ve all waited for…

Let’s Pie Chart

Select ‘Pie Chart’ from the Visualizations section:

Once it’s in your display – select it and drag the ‘released’ field from the Query1 to the ‘details’ field, and then title to the ‘values’ field:

Your chart should look something like this now:

So let’s max size it, and mouse over it, now we can see:

But we can also drag ‘title’ on to the Tooltips field like so to get the First (or last) movie in that group:

What does a query look like under the covers?

Some of us like visual, some like code, the last time we tried this – our query was 20 lines long – our new query though – that’s just 5!

let
    Source = Neo4j.ExecuteCypher("MATCH (m:Movie) RETURN m;", "http", "localhost", 7474),
    #"Expanded m" = Table.ExpandRecordColumn(Source, "m", {"tagline", "title", "released"}, {"tagline", "title", "released"})
in
    #"Expanded m"

Which is much nicer.

Hey hey! It’s Beta!

The Data Connector approach gives a much nicer way to query the database, it strips out a lot of the code we have to write, and hopefully makes the querying easier.

BUT – I am not a PowerBI expert – is this the right way to do this? Are there improvements? Some hardcoded queries we should have there? Let me know – do a PR – it’s all good!

PowerBI With Neo4j – How do you build a DataConnector?



Repo is at:
Release at:

Looky! Pie Charts!

Pie Chart of Movies

This glorious picture represents the very pinnacle of my PowerBI experience, beforehand I was pulling the data into Excel and charting myself – no longer!

Jokes aside, the big news here is that I’ve dramatically improved upon my previous post where I showed how you could connect to a security enabled Neo4j instance from PowerBI by generating your own base64 encoded string. All in all, that’s a terrible approach, sure – it works, but it’s not really manageable for any real use.

Writing a Power BI data connector.

There are a few guides on this, I found the Microsoft repo on GitHub for Data Connectors to be super handy. In essence you write them in ‘M Power Query’ which is Power BI’s query language of choice. I opted to write my connector in Visual Studio – so went and got the Power Query SDK extension.

The nice thing about this is that it allows me to test my connector without needing to constantly start/stop PowerBI. So! We get that installed and create a new Data Connector project:

New Project

This gets you a new Data Connector project with two files that you’ll initially care about, a .pq file and a .query.pq file. The latter being a ‘unit test’ file. Let’s first look at the .pq file.


A .pq file is simply a PowerQuery file, it’s written in M and if you’re a PowerBI specialist – I assume that’s all good – for a non-PowerBI user (me) it means learning some stuff.

So, if you just F5 the project you should get a swirly thing, followed by an error saying credentials are needed.

Select ‘Anonymous’

Then press ‘Set Credential’ – then press F5 again – results!

OK, so what did we actually run when we pressed F5? Remember the .query.pq file? That is executing the Contents() query on the default connector.

    result = PQExtension1.Contents()

OK, so far so – hum drum. This is really to get you used to the Power BI development experience. The good news is that we can just copy / paste from the old post I did and we can have a working function – taking into account that (a) we have the same data (movies DB) and the same user pass (neo4j/neo).

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = () =>
    Source = 
                Headers=[Authorization="Basic bmVvNGo6bmVv"],
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
    results = Source[results]

You should get results saying: [Record] – if you get that, you have successfully connected to Neo4j! Good job! Now, first things first, let’s strip out the Authorization header and auto generate that.


We have two forms, anonymous and user/pass. For anonymous, we don’t want to send the header, for user/pass – we obviously do. Let’s start with user/pass as we’re already there.

So let’s add another function, it’ll generate the headers for us, let’s firstly hardcode it:

DefaultRequestHeaders = [
    #"Authorization" = "Basic " & Binary.ToText(Text.ToBinary("neo4j:neo"), BinaryEncoding.Base64)
];

Changing our function to be:

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = () =>
    Source = 
                //Change HERE vvvvv
                Headers=DefaultRequestHeaders,
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
    results = Source[results]

Pressing F5 will connect and get the same result. So – we know we can do the base64 conversion in PowerBI – this is good. But, still – not ideal to have usernames and passwords hardcoded – for some reason. So let’s let PowerBI get us a user/pass.
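If you want to sanity check the header outside of PowerBI, a Basic auth value is just ‘user:password’ base64 encoded. Here’s a small Python sketch that rebuilds the hardcoded value from earlier; basic_auth_header is a hypothetical helper name of mine, not something from the connector.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization value from a username and password."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return "Basic " + token


header = basic_auth_header("neo4j", "neo")
print(header)
```

For neo4j/neo this gives the same ‘Basic bmVvNGo6bmVv’ string that was hardcoded in the first version of the function.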

Navigate to the ‘Data Source Kind description’ section and add UsernamePassword there:

// Data Source Kind description
PQExtension1 = [
    Authentication = [
        // Key = [],
        UsernamePassword = [],
        // Windows = [],
        Implicit = []
    ],
    Label = Extension.LoadString("DataSourceLabel")
];

You can add things like ‘UsernameLabel’ in there if you want (I have, but for the purposes of this – I’m not gonna bother) – they make it look pretty for the PowerBI people out there 🙂

OK, when you connect now (you might have to delete your credentials – the easiest way being selecting the ‘Credentials’ tab in the M Query Output window and ‘Delete Credential’) you will be able to select User/Pass as an option

But hey! We’re not actually using it yet! So let’s get the values from PowerBI using a handy function (that is really hard to find out about) called: Extension.CurrentCredential() with which we can get the username / password, so let’s update our DefaultRequestHeaders to use it:

DefaultRequestHeaders = [
    #"Authorization" = "Basic " & Neo4j.Encode(Extension.CurrentCredential()[Username], Extension.CurrentCredential()[Password])
];

OK, the dream is alive! For anonymous, basically we want to remove the headers, to do that we need to check what type of authentication is in use, and we’re back to the hilariously undocumented Extension.CurrentCredential method again:

Headers = if Extension.CurrentCredential()[AuthenticationKind] = "Implicit" then null else  DefaultRequestHeaders,

We look for ‘Implicit’ as that’s what Anonymous is – with this we set the Headers to null if we’re anonymous, and the headers if not – ACE!

Getting Stuff

The crux of the whole operation, now we’re able to connect with user/pass and anonymous, it’s probably time we dealt with the hardcoded Cypher. Let’s take in a parameter to our method:

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = (cypher as text) =>
    Source = 
                Content=Text.ToBinary("{""statements"" : [ {
                                //Change HERE vvvvv
                        ""statement"" : "" " & cypher & " ""} ]
    results = Source[results]

Excellent, now we need to change our query.pq file to call it:

	result = PQExtension1.Contents("MATCH (n) RETURN COUNT(n)")

F5 and see what happens – now we’re passing Cypher to the instance, and getting back results.
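One thing to watch with gluing the Cypher straight into the JSON text: a query containing double quotes would break the body. As a sketch of the same ‘statements’ payload built with a JSON serializer instead (in Python, with build_payload being a hypothetical helper of mine, not part of the connector):

```python
import json


def build_payload(cypher: str) -> bytes:
    """Wrap a Cypher statement in the {"statements": [...]} body used above."""
    body = {"statements": [{"statement": cypher}]}
    # json.dumps escapes quotes and backslashes inside the Cypher text.
    return json.dumps(body).encode("utf-8")


query = 'MATCH (p:Person {name:"Tom Hanks"}) RETURN p'
payload = build_payload(query)
print(payload.decode("utf-8"))
```

Parsing the payload back gives exactly the query that went in, quotes and all, which is the property the string concatenation version doesn’t guarantee.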

This largely covers how to build your own connector, if you look in the source code (and I encourage you to – it’s only 156 lines long including comments) – in it you’ll see I’ve abstracted out some of the stuff we’ve done here, named things properly, and I also pull in the address, port and scheme to allow a user to set it.

Better Know APOC #4 : apoc.coll.sort*

Neo4j.Version 3.3.4
APOC Version

If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.

OK, ‘apoc.coll’ has 43 (that’s right – 43) functions and procedures, but I’m only going to cover the ‘sort’ ones for this post – why? Because a post containing 43 different functions – whilst a good % of the overall, would be way too long.

As it is, with ‘sort’ we have 4 functions:

  • apoc.coll.sort
  • apoc.coll.sortMaps
  • apoc.coll.sortMulti
  • apoc.coll.sortNodes

The Whys

These are methods to sort collections, the clue is in the name, but why do we need them? We can sort in Cypher right? We have ‘ORDER BY‘, who wrote this extra bit of code that has no use?? Who?!?!

When Hunger strikes, run for cover

Hunger. Hmmm given his pedigree we may have to assume this was done for a reason… Let’s explore that a bit with the apoc.coll.sort method…


apoc.coll.sort

This is your basic sort method: given a collection, it returns it sorted. It doesn’t matter what type the collection is, it will sort it.


It takes just one parameter in and gives one value out: the in is the collection to sort, the out is the sorted collection.


We’ll look (for this case) at doing it the traditional Cypher way, and then the APOC way.

The Cypher way

It’s worth seeing the Cypher way so you can appreciate sort. This is based on this question on Stack Overflow.

We’ll have a collection which is defined as such:

WITH [2,3,6,5,1,4] AS collection

Let’s sort this the Cypher way – easy!

WITH [2,3,6,5,1,4] AS collection
 RETURN collection ORDER BY ????

Errr, ok, looks like we’re gonna need to tap into some unwinding!

WITH [2,3,6,5,1,4] AS collection
UNWIND collection AS item
WITH item ORDER BY item
RETURN collect(item) AS sorted

That’s got it! We UNWIND the collection, pass each item through WITH (applying the ORDER BY), and then COLLECT them back again.

The APOC way

WITH [2,3,6,5,1,4] AS collection
RETURN apoc.coll.sort(collection) AS sorted

That’s a lot easier to read, and it’s also a lot easier to use inline. The Cypher version above might look OK, but imagine you have a more complicated query and need to do multiple sorts, or even just anything extra: it can quickly become unwieldy.


apoc.coll.sortMaps

A Map (or Dictionary for those .NETters out there) is the sort of thing we return from Neo4j all the time, and this function allows us to sort a list of Maps on a given property.


For these examples, we’ll have ‘coll’ defined as:

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll

An array of maps, with an ‘Str‘ property, and a ‘Num‘ property.

Sort by string

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Str')

Returns us a list of the maps, looking like:

│"apoc.coll.sortMaps(coll, 'Str')"                                     │
│[{"Str":"A","Num":4},{"Str":"B","Num":3},{"Str":"C","Num":2},{"Str":"D│
│","Num":1}]                                                           │

In which we can see the maps go from ‘A’ to ‘D’

Sort by Number

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Num')

Unsurprisingly, this gets us the following:

│"apoc.coll.sortMaps(coll, 'Num')"                                     │
│[{"Str":"D","Num":1},{"Str":"C","Num":2},{"Str":"B","Num":3},{"Str":"A│
│","Num":4}]                                                           │

Which goes from 1 to 4.

Sort order is ascending; there is no way to do a descending sort. You basically do a ‘reverse’ on the result to get the sort the other way.
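
To make the ‘sort, then reverse’ idea concrete, here is a small Python model of the behaviour described above. It is not APOC code: `sort_maps` is a hypothetical stand-in that simply assumes an ascending sort on one property, as seen in the results in this post.

```python
def sort_maps(maps, prop):
    """Model of an ascending sort of a list of maps on one property."""
    return sorted(maps, key=lambda m: m[prop])


coll = [
    {"Str": "A", "Num": 4},
    {"Str": "B", "Num": 3},
    {"Str": "D", "Num": 1},
    {"Str": "C", "Num": 2},
]

ascending = sort_maps(coll, "Num")
# Reversing the ascending result gives the descending order
descending = list(reversed(ascending))

print([m["Num"] for m in ascending])
print([m["Num"] for m in descending])
```

The same two-step shape (sort first, reverse second) is what you would end up writing in Cypher.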


apoc.coll.sortMulti

This is the equivalent of doing a ‘Sort, Then By’. To see why we need it, if I take the ‘sortMaps’ function above and run it like so:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
RETURN apoc.coll.sortMaps(coll, 'First')

I get:

│"apoc.coll.sortMaps(coll, 'First')"                                   │
│[{"Last":"A","First":"A"},{"Last":"B","First":"B"},{"Last":"A","First"│
│:"B"},{"Last":"A","First":"C"}]                                       │

The problem here is the two elements with a First of ‘B’, which come back with Last ‘B’ ahead of Last ‘A’.


I want these to be the other way around, so I have to switch to ‘Multi’:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
UNWIND apoc.coll.sortMulti(coll, ['^First', '^Last']) AS unwound
RETURN unwound.First AS first, unwound.Last AS last

This gets me:

│"first"│"last"│
│"A"    │  "A" │
│"B"    │  "A" │
│"B"    │  "B" │
│"C"    │  "A" │

One thing to note here (and I think it’s quite important) is that this is the only method that defaults to descending order. To get an ascending sort, you have to prefix the column names with a ‘^’ character (as I’ve done in this case).
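
The ‘Sort, Then By’ behaviour and the ‘^’ prefix can be modelled in a few lines of Python. This is only a sketch of the semantics as described in this post, not how APOC implements it, and `sort_multi` is a hypothetical name. It assumes each field sorts descending unless it starts with ‘^’.

```python
def sort_multi(maps, order_fields):
    """Model of a multi-key sort: descending by default, '^' prefix means ascending."""
    result = list(maps)
    # Apply the least significant field first. Python's sort is stable,
    # so the earlier fields end up taking priority.
    for field in reversed(order_fields):
        ascending = field.startswith("^")
        name = field[1:] if ascending else field
        result.sort(key=lambda m: m[name], reverse=not ascending)
    return result


coll = [
    {"First": "B", "Last": "B"},
    {"First": "A", "Last": "A"},
    {"First": "B", "Last": "A"},
    {"First": "C", "Last": "A"},
]

for row in sort_multi(coll, ["^First", "^Last"]):
    print(row["First"], row["Last"])
```

Dropping the ‘^’ from a field name in `order_fields` flips that field to descending in this model, which mirrors the default described above.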


apoc.coll.sortNodes

Nearly there! This takes a collection of nodes and sorts them on 1 property – so, let’s add some nodes:

CREATE (n1:CollNode {col1: 1, col2: 'D'})
CREATE (n2:CollNode {col1: 2, col2: 'C'})
CREATE (n3:CollNode {col1: 3, col2: 'B'})
CREATE (n4:CollNode {col1: 4, col2: 'A'})

And let’s do a sort:

MATCH (n:CollNode)
WITH apoc.coll.sortNodes(COLLECT(n), 'col2') AS sorted
UNWIND sorted AS n
RETURN n.col1 AS col1, n.col2 AS col2

Now, you could argue this adds little to the party as you can already ORDER BY, and by and large you’re right – the nice thing about the apoc version is that you can call it as I have above, rather than having to do the sort afterwards. Having said that, ORDER BY does have a DESC keyword as well, which sortNodes does not :/


apoc.coll.sort* is useful, that’s the main thrust, some are more useful than others, and there are a few omissions (like the ability to sort desc for all but the sortMulti method) which could be good simple pull requests.

They are what they are, sorting methods 🙂

Neo4j with Azure Functions

Recently, I’ve had a couple of people ask me how to use Neo4j with Azure functions, and well – I’d not done it myself, but now I have – let’s get it done!

  1. Login to your Azure Portal

  2. Create a new Resource, and search for ‘function app’

  3. Select ‘Function App’ from the Market Place

  4. Press ‘Create’ to actually make one

  5. Fill in your details as you want them

I’m assuming you’re reasonably au fait with the settings here. In essence, if you have a Resource Group you want to put it into (maybe something with a VNet) then go for it; in my case, I’ve just created a new instance of everything.

  6. Create the function, and wait for it to be ready. Mayhaps make a tea or coffee, have a break from the computer for a couple of mins – it’s all good!

  7. When it’s ready, click on it and go into the Function App itself (if it doesn’t take you there!)

  8. Create a new function

  9. We want to create an HttpTrigger function in C# for this instance

  10. This gives us a ‘run.csx’ file, which will have a load of default code. You can run it if you want, and you’ll see an output window appear showing the response.


Well – good – Azure Functions work, so let’s get a connection to a Neo4j instance – now – for this I’m assuming you have an IP to connect to – you can always use the free tier on GrapheneDB if you want to play around with this.

  11. Add references to a driver

We need to add a reference to a Neo4j client. In this case I’ll show the official driver, but it will work just as well with the community driver. First off, we need to add a ‘project.json’ file, so press ‘View Files’ on the left hand side.


Then add a file:


Then call it project.json – and yes it has to be that name:


With our new empty file, we need to paste in the nuget reference we need:

{
  "frameworks": {
    "net46": {
      "dependencies": {
        "neo4j.driver": "1.5.2"
      }
    }
  }
}

Annoyingly, if you copy/paste this into the webpage, the editor will add extra ‘closing’ curly braces, so just delete those.


If you press ‘Save and Run’ you should get the same response as before, which is good as it means that the Neo4j.Driver package has been installed. If we look at the files, we’ll see the ‘project.lock.json’ file, which is what we want to see.


  12. Code

We want to add our connection information now. We’re going to go basic, and just return the COUNT of the nodes in our DB. First we need to add a ‘using’ statement to our code:

So add,

using Neo4j.Driver.V1;

Then replace the code in the function with:

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    // Create a driver for the Bolt endpoint; replace YOURIP, user and pass with your own details
    using (var driver = GraphDatabase.Driver("bolt://YOURIP:7687", AuthTokens.Basic("user", "pass")))
    {
        using (var session = driver.Session())
        {
            // The query returns a single row holding the node count
            IRecord record = session.Run("MATCH (n) RETURN COUNT(n)").Single();
            int count = record["COUNT(n)"].As<int>();
            return req.CreateResponse(HttpStatusCode.OK, "Count: " + count);
        }
    }
}

Basically, we’ll create a Driver, open a session and then return a 200 with the count!

  13. Run

You can now ‘Save and Run’ and your output window should now tell you the count:


  14. Done

Your first function using Neo4j, Yay!