PowerBI With Neo4j – How do you build a DataConnector?

TL;DR;

Repo is at: https://github.com/cskardon/Neo4jDataConnectorForPowerBi
Release at: https://github.com/cskardon/Neo4jDataConnectorForPowerBi/releases

Looky! Pie Charts!

Pie Chart of Movies

This glorious picture represents the very pinnacle of my PowerBI experience. Beforehand I was pulling the data into Excel and charting it myself. No longer!

Jokes aside, the big news here is that I’ve dramatically improved upon my previous post, where I showed how you could connect to a security-enabled Neo4j instance from PowerBI by generating your own base64-encoded string. All in all, that’s a terrible approach. Sure, it works, but it’s not really manageable for any real use.

Writing a Power BI data connector.

There are a few guides on this; I found the Microsoft repo on GitHub for Data Connectors to be super handy. In essence you write them in ‘M’, the Power Query formula language, which is Power BI’s query language of choice. I opted to write my connector in Visual Studio, so I went and got the Power Query SDK extension.

The nice thing about this is that it allows me to test my connector without needing to constantly start/stop PowerBI. So! We get that installed and create a new Data Connector project:

New Project

This gets you a new Data Connector project with two files that you’ll initially care about: a .pq file and a .query.pq file, the latter being a ‘unit test’ file. Let’s first look at the .pq file.

.PQ

A .pq file is simply a PowerQuery file. It’s written in M, and if you’re a PowerBI specialist I assume that’s all good. For a non-PowerBI user (me) it means learning some stuff.

So, if you just F5 the project you should get a swirly thing, followed by an error saying credentials are needed.

Select ‘Anonymous’

Then press ‘Set Credential’ – then press F5 again – results!

OK, so what did we actually run when we pressed F5? Remember the .query.pq file? It executes the Contents() function on the default connector:

let
    result = PQExtension1.Contents()
in
    result

OK, so far so humdrum. This is really to get you used to the Power BI development experience. The good news is that we can just copy / paste from the old post I did and have a working function, provided that we have (a) the same data (movies DB) and (b) the same user/pass (neo4j/neo).

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = () =>
let
    Source = 
        Json.Document(
            Web.Contents("http://localhost:7474/db/data/transaction/commit",
            [
                Headers=[Authorization="Basic bmVvNGo6bmVv"],
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
                        }")
            ])),
    results = Source[results]
in
    results;

You should get results saying [Record]. If you get that, you have successfully connected to Neo4j! Good job! Now, first things first, let’s strip out the hardcoded Authorization header and auto generate it.
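If it isn't obvious what that query actually sends: supplying Content to Web.Contents turns the call into an HTTP POST with the JSON document as its body. The Python sketch below builds (but never sends) a roughly equivalent request, purely as an illustration. build_request is a hypothetical helper name, and the Content-Type header is my own addition rather than something the M code sets explicitly.

```python
import json
import urllib.request

# Endpoint and header value copied from the M example; adjust for your instance.
URL = "http://localhost:7474/db/data/transaction/commit"
AUTH = "Basic bmVvNGo6bmVv"  # base64 of "neo4j:neo"


def build_request(cypher: str) -> urllib.request.Request:
    """Build (without sending) a POST shaped like the one the connector issues."""
    body = json.dumps({"statements": [{"statement": cypher}]}).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=body,  # providing a body makes urllib use POST
        headers={
            "Authorization": AUTH,
            "Content-Type": "application/json",  # assumption, see lead-in
        },
    )


req = build_request("MATCH (n) RETURN count(n)")
print(req.get_method(), req.full_url)
```

Nothing here talks to Neo4j; it only shows the method, URL, header and body that end up on the wire.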

Authorization

We have two forms, anonymous and user/pass. For anonymous, we don’t want to send the header, for user/pass – we obviously do. Let’s start with user/pass as we’re already there.

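So let’s add a record that holds the headers for us. Let’s firstly hardcode it (Binary.ToText wants a binary value, so the text goes through Text.ToBinary first):

DefaultRequestHeaders = [
    #"Authorization" = "Basic " & Binary.ToText(Text.ToBinary("neo4j:neo"), BinaryEncoding.Base64)
];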
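If you want to cross-check the header value outside Power BI, the encoding is just ‘Basic ’ followed by the base64 of ‘username:password’. A minimal Python sketch, where basic_auth_header is a hypothetical helper name and not part of the connector:

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value for an HTTP Basic 'Authorization' header."""
    token = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")


print(basic_auth_header("neo4j", "neo"))  # → Basic bmVvNGo6bmVv
```

The output matches the hardcoded string used in the first version of the function, which is a handy sanity check.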

Changing our function to be:

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = () =>
let
    Source = 
        Json.Document(
            Web.Contents("http://localhost:7474/db/data/transaction/commit",
            [
                //Change HERE vvvvv
                Headers=DefaultRequestHeaders,
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
                        }")
            ])),
    results = Source[results]
in
    results;

Pressing F5 will connect and get the same result. So we know we can do the base64 conversion in PowerBI, which is good. But still, it’s not ideal to have usernames and passwords hardcoded (for some reason). So let’s let PowerBI get us a user/pass.

Navigate to the ‘Data Source Kind description’ section and add UsernamePassword there:

// Data Source Kind description
PQExtension1 = [
    Authentication = [
        // Key = [],
        UsernamePassword = [],
        // Windows = [],
        Implicit = []
    ],
    Label = Extension.LoadString("DataSourceLabel")
];

You can add things like ‘UsernameLabel’ in there if you want (I have, but for the purposes of this – I’m not gonna bother) – they make it look pretty for the PowerBI people out there 🙂

OK, when you connect now you will be able to select User/Pass as an option. (You might have to delete your credentials first; the easiest way is to select the ‘Credentials’ tab in the M Query Output window and press ‘Delete Credential’.)

But hey! We’re not actually using it yet! So let’s get the values from PowerBI using a handy function (that is really hard to find out about) called Extension.CurrentCredential(), with which we can get the username / password. Let’s update our DefaultRequestHeaders to use it. Neo4j.Encode isn’t defined in this post; treat it as a small helper that base64-encodes ‘username:password’:

DefaultRequestHeaders = [
    #"Authorization" = "Basic " & Neo4j.Encode(Extension.CurrentCredential()[Username], Extension.CurrentCredential()[Password])
];

OK, the dream is alive! For anonymous, basically we want to remove the headers. To do that we need to check what type of authentication is in use, and we’re back to the hilariously undocumented Extension.CurrentCredential function again:

Headers = if Extension.CurrentCredential()[AuthenticationKind] = "Implicit" then null else DefaultRequestHeaders,

We look for ‘Implicit’ as that’s what Anonymous is. With this we set the Headers to null if we’re anonymous, and to DefaultRequestHeaders if not. ACE!
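The same decision, written out as a Python sketch for anyone who finds the M one-liner a bit dense. The dictionary stands in for the record returned by Extension.CurrentCredential(); its field names come from the code above, while request_headers and the "UsernamePassword" kind value in the usage lines are assumptions for illustration.

```python
import base64
from typing import Optional


def request_headers(credential: dict) -> Optional[dict]:
    """Choose headers from a record shaped like Extension.CurrentCredential()."""
    if credential["AuthenticationKind"] == "Implicit":
        return None  # anonymous: send no Authorization header at all
    token = f"{credential['Username']}:{credential['Password']}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(token).decode("ascii")}


anonymous = {"AuthenticationKind": "Implicit"}
user_pass = {
    "AuthenticationKind": "UsernamePassword",  # assumed value for user/pass
    "Username": "neo4j",
    "Password": "neo",
}
print(request_headers(anonymous))
print(request_headers(user_pass))
```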

Getting Stuff

This is the crux of the whole operation. Now that we’re able to connect with user/pass and anonymous, it’s probably time we dealt with the hardcoded Cypher. Let’s take in a parameter to our function:

[DataSource.Kind="PQExtension1", Publish="PQExtension1.Publish"]
shared PQExtension1.Contents = (cypher as text) =>
let
    Source = 
        Json.Document(
            Web.Contents("http://localhost:7474/db/data/transaction/commit",
            [
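                Headers=DefaultRequestHeaders,
                // Change HERE: the Cypher text is now spliced into the JSON body.
                // (The comment sits outside the string so it isn't sent to Neo4j.)
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : "" " & cypher & " ""} ]
                        }")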
            ])),
    results = Source[results]
in
    results;

Excellent, now we need to change our .query.pq file to call it:

let 
	result = PQExtension1.Contents("MATCH (n) RETURN COUNT(n)")
in
	result

F5 and see what happens – now we’re passing Cypher to the instance, and getting back results.
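One thing to watch with pasting the Cypher straight into the JSON text: a query containing double quotes produces a body that is no longer valid JSON. The Python sketch below illustrates the difference between concatenation and letting a JSON serializer do the escaping. Both function names are hypothetical, and the example query is made up to trigger the problem.

```python
import json


def body_by_concatenation(cypher: str) -> str:
    # Mirrors the M code: the Cypher text is pasted straight into the JSON.
    return '{"statements" : [ {"statement" : " ' + cypher + ' "} ]}'


def body_by_serializer(cypher: str) -> str:
    # Lets the JSON library escape quotes, backslashes and newlines.
    return json.dumps({"statements": [{"statement": cypher}]})


query = 'MATCH (p:Person {name: "Tom Hanks"}) RETURN p.name'

try:
    json.loads(body_by_concatenation(query))
    concatenation_ok = True
except json.JSONDecodeError:
    concatenation_ok = False

roundtrip = json.loads(body_by_serializer(query))["statements"][0]["statement"]
print(concatenation_ok, roundtrip == query)
```

Sticking to single quotes inside the Cypher (as the Tom Hanks query does) sidesteps the issue; in M, Json.FromValue looks like the natural way to get the serializer behaviour, though I haven't wired that into the connector here.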

This largely covers how to build your own connector. If you look in the source code (and I encourage you to, it’s only 156 lines long including comments) you’ll see I’ve abstracted out some of the stuff we’ve done here, named things properly, and pulled in the address, port and scheme so a user can set them.
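To give a feel for the address / port / scheme part without reading the repo, here is a minimal Python sketch of composing the endpoint from those three values. commit_endpoint and its parameter names are assumptions for illustration; the real connector’s names may differ.

```python
def commit_endpoint(scheme: str, address: str, port: int) -> str:
    """Compose the transactional endpoint URL from user-supplied parts."""
    return f"{scheme}://{address}:{port}/db/data/transaction/commit"


print(commit_endpoint("http", "localhost", 7474))
```

With the defaults used throughout this post, that gives back the URL we have been hardcoding.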

Using PowerBI with Neo4j

There’s an excellent post by Cédric Charlier over at his blog about hooking Neo4j into PowerBI. It’s simple to follow and gets you up and running, but I (as a PowerBI newbie) had a couple of spots where I ran into trouble, generally where it assumes you know how to navigate around the PowerBI interface. (I didn’t.)

So, here is a simple tutorial to get us non-BI people up and running!

I’ve written a Data Connector for Neo4j now – and I would heartily recommend you have a look at the new post I’ve written about it here: http://xclave.co.uk/2019/02/06/actually-using-the-new-dataconnector-for-powerbi/

The Setup Steps

First – we’ve got to install PowerBI – now, I didn’t sign up for an account, but downloaded it from the PowerBI website, and installing was simple and quick.

We also need to have Neo4j running, and you can use Community or Enterprise, it matters not – and we’ll want to put the ‘Movies’ dataset in there, so run your instance, and execute:

:play Movies

Now we’re ready to ‘BI’!

Step 1 – Start Power BI Desktop

This is pretty obvious, but in case you need it – click on the ‘Power BI Desktop’ link in your start menu – or double click on it if you went and put it on the Desktop. Crazy days.

Step 2 – Click on ‘Get Data’

image

That way we can get data!

Step 3 – Select ‘Blank Query’

Why not ‘web’ you ask? Well as we’re going to do some copy/pasting – it’s easier from a blank query point of view.

image

Step 4 – Advanced

In the query editor window that pops up, select ‘Advanced Editor’

image

Step 5 – Get Data!

We’re going to use the same query as Cédric as you can then use this post to augment his, so in the query editor simply paste:

let
    Source = Web.Contents( "http://localhost:7474/db/data/transaction/commit",
             [
                 Content=Text.ToBinary("{
                          ""statements"" : [ {
                          ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
             }")]
             )
in
    Source

Oh noes! The same error as Cédric got – authentication. You can’t send the login details via changing the URL to be something like:

http://user:pass@localhost….

as that also fails, but you can send in the auth as a header, by adding this line:

Headers = [#"Authorization" = "Basic bmVvNGo6bmVv"],

What is this bmVvNGo6bmVv? Well, that’s the base64 encoded user/pass combo – which is a bit uh oh as you have to generate this 🙁

I’ve got two options here – LinqPad and Powershell

LinqPad

Use this bit of C#. Obviously you can write your own C# app in VS or whatever, but typically I use LinqPad for quick scripts.

var username = "neo4j";
var password = "neo";

var encoded = Encoding.ASCII.GetBytes(string.Format("{0}:{1}", username, password));
var base64 = Convert.ToBase64String(encoded);

base64.Dump();

Powershell

This does pretty much the same, but can obviously be run in a Powershell prompt – which is nice!

Param(
    [string]$username,
    [string]$password
)

$encoder = [System.Text.Encoding]::UTF8
$token = $username + ":" + $password
$encoded = $encoder.GetBytes($token)
$base64 = [System.Convert]::ToBase64String($encoded)
Write-Output $base64

which is then used like:

GetAuthCode.ps1 -username neo4j -password neo

So, with this information, our new ‘Get data’ bit looks like:

let
    Source = Web.Contents( "http://localhost:7474/db/data/transaction/commit",
             [
                 Headers = [#"Authorization" = "Basic bmVvNGo6bmVv"],
                 Content=Text.ToBinary("{
                          ""statements"" : [ {
                          ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
             }")]
             )
in
    Source

which when we ‘preview’ gives us this:

image

Step 6 – Read as Json

Select the ‘localhost’ file and then choose ‘open as Json’ from the top menu:

image

You’ll notice once you’ve done this – your ‘Source’ has changed to now be ‘Json.Document(Web.Contents…)’

image

Step 7 – Navigation

First, click on the ‘List’ next to ‘results’.

This will take you to a screen that looks like this:

image

Note, you now have another ‘Step’ in the right hand bar. By the way, if you ever ‘lose’ the Settings side bar, click on ‘View’ at the top and select ‘Query Settings’ to bring it back.

Then click on the ‘Record’ link, and then the ‘List’ for data:

image

Worth noting here, we’re still in the ‘Navigation’ step

Now you should have a list of ‘Record’s:

image
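If the clicking feels like magic, it helps to see the JSON we’re walking through. The Python sketch below follows the same path (results, first record, data) over a hand-written sample in the shape the transactional endpoint returns. The actor names and numbers are made up for illustration, not real query output.

```python
import json

# Hypothetical response in the shape of Neo4j's transactional endpoint;
# the names and numbers are invented for illustration only.
SAMPLE = """
{
  "results": [
    {
      "columns": ["Recommended", "Strength"],
      "data": [
        {"row": ["Actor A", 5], "meta": [null, null]},
        {"row": ["Actor B", 3], "meta": [null, null]}
      ]
    }
  ],
  "errors": []
}
"""

document = json.loads(SAMPLE)  # 'open as Json'
results = document["results"]  # the 'List' next to results
first = results[0]             # the single 'Record' in that list
data = first["data"]           # the 'List' next to data

for record in data:
    print(sorted(record.keys()))  # each record carries 'meta' and 'row'
```

Each entry in data is one of the ‘Record’s in the list, with the values we care about tucked inside its row.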

Step 8 – Table-ify

Go ahead and press the ‘To Table’ button, and then just ‘OK’ on the dialog that pops up:

image

Step 9 – Expand the Column

Records aren’t useful to Power BI (apparently), so we need to expand that column out. To do that, click on the ‘Expand’ button. In our case we only want the ‘row’, not the ‘meta’, so unselect ‘meta’ and press OK.

image

Now you should see a column of ‘List’ values and an extra step in our ‘Applied Steps’ list:

image

Step 10 – Add a custom column

So now we need to get the information out of these new ‘Lists’ – and to do that we need a custom column, so click on the ‘Custom Column’ button in the ‘Add Column’ tab:

image

In the dialog that pops up we want to have it say:

= Record.FromList([Column1.row], type[Name = text, Rank = number])

image

Then press OK, and you’ll have another Column called ‘Custom’, and another item in our Applied Steps:

image

Step 11 – Expand Custom

More records eh? Let’s expand it out, so as before, click on the ‘Expand’ button:

image

and in this case, we want all the columns:

image

Now you should have two new columns, and another step added:

image

Data! Yay!
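What the custom column and the expand are doing, roughly, is pairing each value in a ‘row’ list with a field name by position. A Python sketch of that idea, using made-up rows and a hypothetical to_record helper standing in for Record.FromList:

```python
# Made-up 'row' lists, standing in for the Column1.row values.
rows = [["Actor A", 5], ["Actor B", 3]]

# Same field names as in the custom column formula.
FIELD_NAMES = ["Name", "Rank"]


def to_record(row: list) -> dict:
    """Pair each value with a field name by position, like Record.FromList."""
    if len(row) != len(FIELD_NAMES):
        raise ValueError("row length does not match the field names")
    return dict(zip(FIELD_NAMES, row))


table = [to_record(row) for row in rows]
print(table)
```

So ‘Name’ receives the first value the query returns (Recommended) and ‘Rank’ the second (Strength), purely because of their order.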

Step 12 – Remove that non-useful column

Right click on the ‘Column1.row’ column and select Remove

image

Step 13 – Close & Apply

Now we have data in a format we can use in Power BI, let’s close and apply that query.

image

Step 14 – Use that data

Now – I’m no Power BI user – so this is super simple and pointless, but should get you going for experimenting.

After applying that query we’re back in the main desktop view, but now in the right hand side – we have some fields with our Query there:

image

Let’s VISUALIZE

I’m going to pick a ‘Treemap’ – because.

image

Empty treemap – Check!

image

Let’s set some data, I want to group by ‘Rank’, so I drag ‘Custom.Rank’ to the ‘Group’ section which is in the ‘Visualizations’ bar:

image

And then for ‘Values’ I’m going to drag the ‘Custom.Name’ field

image

Oooooh – colours:

image

Let’s expand our visualization by pressing the ‘Focus Mode’ button:

image

Boom! Full size

Now, if I hover over one of those boxes I get the brief info displayed:

image

Ace, only 2 names with a rank of 5, and to see who they are, right click and select ‘See Records’

image

And here they are:

image

No More Steps

If you want to just copy/paste the code, you can! Create a new blank query and open up the advanced editor and just paste the code below in. (NB There are probably loads of things which are rubbish about this implementation, lemme know!)

let
    Source = 
        Json.Document(
            Web.Contents("http://localhost:7474/db/data/transaction/commit",
            [
                Headers=[Authorization="Basic bmVvNGo6bmVv"],
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
                        }")
            ])),
    results = Source[results],
    results1 = results{0},
    data = results1[data],
    #"Converted to Table" = Table.FromList(data, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"row"}, {"Column1.row"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Column1", "Custom", each Record.FromList([Column1.row], type[Name = text, Rank = number])),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"Name", "Rank"}, {"Custom.Name", "Custom.Rank"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1.row"})
in
    #"Removed Columns"
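On the ‘things which are rubbish’ front, two candidates: the field names are hardcoded, and nothing looks at the ‘errors’ list the endpoint includes in its response. The Python sketch below shows one possible approach, assuming the response shape described earlier: take the names from the response’s own ‘columns’ list and fail loudly if ‘errors’ is non-empty. rows_as_records is a hypothetical helper and the sample data is made up; this is not what the M query above does.

```python
def rows_as_records(response: dict) -> list:
    """Turn a parsed transaction response into dicts keyed by column name."""
    errors = response.get("errors") or []
    if errors:
        # Cypher problems are reported in the body, so check before reading rows.
        raise RuntimeError(errors[0].get("message", "unknown error"))
    result = response["results"][0]
    columns = result["columns"]
    return [dict(zip(columns, item["row"])) for item in result["data"]]


# Made-up response for illustration only.
sample = {
    "results": [
        {
            "columns": ["Recommended", "Strength"],
            "data": [{"row": ["Actor A", 5], "meta": [None, None]}],
        }
    ],
    "errors": [],
}
print(rows_as_records(sample))
```

The upside of reading ‘columns’ is that the same code works for any RETURN clause, not just the two-column recommendation query.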