Better Know APOC #4 : apoc.coll.sort*

Neo4j Version 3.3.4
APOC Version 3.3.0.2

If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.


OK, ‘apoc.coll’ has 43 (that’s right – 43) functions and procedures, but I’m only going to cover the ‘sort’ ones for this post – why? Because a post containing 43 different functions – whilst a good % of the overall, would be way too long.

As it is, with ‘sort’ we have 4 functions:

  • apoc.coll.sort
  • apoc.coll.sortMaps
  • apoc.coll.sortMulti
  • apoc.coll.sortNodes

The Whys

These are methods to sort collections, the clue is in the name, but why do we need them? We can sort in Cypher right? We have ‘ORDER BY‘, who wrote this extra bit of code that has no use?? Who?!?!

When Hunger strikes, run for cover

Hunger. Hmmm given his pedigree we may have to assume this was done for a reason… Let’s explore that a bit with the apoc.coll.sort method…

apoc.coll.sort

This is your basic sort method, given a collection, return it sorted. It doesn’t matter what type the collection is, it will sort it.

Parameters

Just the one for in and one for out, the in is the collection to sort, the out is the sorted collection.

Examples

We’ll look (for this case) at doing it the traditional Cypher way, and then the APOC way.

The Cypher way

It’s worth seeing the Cypher way so you can appreciate sort, this is based on this question on Stack Overflow.

We’ll have a collection which is defined as such:

WITH [2,3,6,5,1,4] AS collection

Let’s sort this the Cypher way – easy!

WITH [2,3,6,5,1,4] AS collection
 RETURN collection ORDER BY ????

Errr, ok, looks like we’re gonna need to tap into some unwinding!

WITH [2,3,6,5,1,4] AS collection
UNWIND collection AS item
WITH item ORDER BY item
RETURN collect(item) AS sorted

That’s got it! We UNWIND the collection, sort the items using WITH and ORDER BY, then COLLECT them back into a list.
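
The same shape handles a descending sort by adding DESC to the ORDER BY. A small sketch, reusing the collection from above (sortedDesc is just an alias picked for illustration):

```cypher
WITH [2,3,6,5,1,4] AS collection
UNWIND collection AS item
// DESC flips the order before the items are collected again
WITH item ORDER BY item DESC
RETURN collect(item) AS sortedDesc
```

That should give back the list running from 6 down to 1.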

The APOC way

WITH [2,3,6,5,1,4] AS collection
RETURN apoc.coll.sort(collection) AS sorted

That’s a lot easier to read, it’s also a lot easier to use inline. The Cypher version above might look ok, but imagine you have a more complicated query, and you need to either do multiple sorts, or even just anything extra, it can quickly become unwieldy.
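
To give a feel for the inline point, here is a sketch with two made-up lists (numbers and fruit are illustrative names, not from any dataset) sorted in a single RETURN:

```cypher
WITH [2,3,6,5,1,4] AS numbers, ['pear', 'apple', 'fig'] AS fruit
// Each list is sorted inline, with no UNWIND / COLLECT round trip
RETURN apoc.coll.sort(numbers) AS sortedNumbers,
       apoc.coll.sort(fruit) AS sortedFruit
```

Doing the same with UNWIND would need a separate unwind, order and collect pass for each list.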

apoc.coll.sortMaps

A Map (or Dictionary for those .NETters out there) is the sort of thing we return from Neo4j all the time, and this function allows us to sort on a given property of a Map.

Examples

For these examples, we’ll have ‘coll’ defined as:

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll

An array of maps, with an ‘Str‘ property, and a ‘Num‘ property.

Sort by string

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Str')

Returns us a list of the maps, looking like:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'Str')"                                     │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Str":"A","Num":4},{"Str":"B","Num":3},{"Str":"C","Num":2},{"Str":"D│
│","Num":1}]                                                           │
└──────────────────────────────────────────────────────────────────────┘

In which we can see the maps go from ‘A’ to ‘D’

Sort by Number

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Num')

Unsurprisingly, this gets us the following:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'Num')"                                     │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Str":"D","Num":1},{"Str":"C","Num":2},{"Str":"B","Num":3},{"Str":"A│
│","Num":4}]                                                           │
└──────────────────────────────────────────────────────────────────────┘

Which goes from 1 to 4.

Sort order is ascending, and there is no option for a descending sort. To get the list the other way around, you basically ‘reverse’ the result.
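
A minimal sketch of flipping the order, assuming Cypher’s built-in reverse() list function and the ascending output shown above (sortedDesc is an illustrative alias):

```cypher
WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
// sortMaps gives Num 1 to 4; reverse() flips the resulting list
RETURN reverse(apoc.coll.sortMaps(coll, 'Num')) AS sortedDesc
```

If sortMaps behaves as it does above, the maps should come back with Num running from 4 down to 1.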

apoc.coll.sortMulti

This is the equivalent of doing a ‘Sort, Then By’ – so if I take the ‘sortMaps’ function above and run it like so:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
RETURN apoc.coll.sortMaps(coll, 'First')

I get:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'First')"                                   │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Last":"A","First":"A"},{"Last":"B","First":"B"},{"Last":"A","First"│
│:"B"},{"Last":"A","First":"C"}]                                       │
└──────────────────────────────────────────────────────────────────────┘

The problem here is the two elements:

{"Last":"B","First":"B"},{"Last":"A","First":"B"}

I want these to be the other way around, so I have to switch to ‘Multi’:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
UNWIND apoc.coll.sortMulti(coll, ['^First', '^Last']) AS unwound
RETURN unwound.First AS first, unwound.Last AS last

This gets me:

╒═══════╤══════╕
│"first"│"last"│
╞═══════╪══════╡
│"A"    │  "A" │
├───────┼──────┤
│"B"    │  "A" │
├───────┼──────┤
│"B"    │  "B" │
├───────┼──────┤
│"C"    │  "A" │
└───────┴──────┘

One thing to note here – (and I think it’s quite important) is that this is the only method that defaults to Descending order. To get an Ascending sort, you have to prefix columns with a ‘^’ character (as I’ve done in this case).
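
To see that default in action, here is a sketch of the same query with the ‘^’ dropped from the second field. I haven’t captured output for this one, so treat the result as an expectation rather than a fact:

```cypher
WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
// '^First' sorts ascending; 'Last' has no prefix, so it falls back to the descending default
UNWIND apoc.coll.sortMulti(coll, ['^First', 'Last']) AS unwound
RETURN unwound.First AS first, unwound.Last AS last
```

Going by the behaviour described above, First should still run A to C, but the two ‘B’ rows should swap so that B/B comes before B/A.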

apoc.coll.sortNodes

Nearly there! This takes a collection of nodes and sorts them on 1 property – so, let’s add some nodes:

CREATE (n1:CollNode {col1: 1, col2: 'D'})
CREATE (n2:CollNode {col1: 2, col2: 'C'})
CREATE (n3:CollNode {col1: 3, col2: 'B'})
CREATE (n4:CollNode {col1: 4, col2: 'A'})

And let’s do a sort:

MATCH (n:CollNode)
WITH apoc.coll.sortNodes(COLLECT(n), 'col2') AS sorted
UNWIND sorted AS n
RETURN n.col1 AS col1, n.col2 AS col2

Now, you could argue this adds little to the party as you can already ORDER BY, and by and large you’re right – the nice thing about the apoc version is that you can call it as I have above, rather than having to do the sort afterwards. Having said that, ORDER BY does have a DESC keyword as well, which sortNodes does not :/
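
For comparison, this is roughly what the plain Cypher version looks like with a descending sort, assuming the four CollNode nodes created above:

```cypher
MATCH (n:CollNode)
RETURN n.col1 AS col1, n.col2 AS col2
// DESC is the bit sortNodes has no equivalent for
ORDER BY col2 DESC
```

With the data above, that should list col2 from 'D' down to 'A'.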

Conclusions

apoc.coll.sort* is useful, that’s the main thrust, some are more useful than others, and there are a few omissions (like the ability to sort desc for all but the sortMulti method) which could be good simple pull requests.

They are what they are, sorting methods 🙂

Neo4j with Azure Functions

Recently, I’ve had a couple of people ask me how to use Neo4j with Azure functions, and well – I’d not done it myself, but now I have – let’s get it done!

  1. Log in to your Azure Portal

  2. Create a new Resource, and search for ‘function app’:

image

  3. Select ‘Function App’ from the Market Place:

image

  4. Press ‘Create’ to actually make one:

image

  5. Fill in your details as you want them

image

I’m assuming you’re reasonably au fait with the settings here. In essence, if you have a Resource Group you want to put it into (maybe something with a VNet) then go for it; in my case, I’ve just created a new instance of everything.

  6. Create the function, and wait for it to be ready. Mayhaps make a tea or coffee, have a break from the computer for a couple of mins – it’s all good!

image

  7. When it’s ready, click on it and go into the Function App itself (if it doesn’t take you there!)

  8. Create a new function:

image

  9. We want to create an HttpTrigger function in C# for this instance:

image

  10. This gives us a ‘run.csx’ file, which will have a load of default code, you can run it if you want,

image

and you’ll see an output window appear which will say:

image

Well – good – Azure Functions work, so let’s get a connection to a Neo4j instance – now – for this I’m assuming you have an IP to connect to – you can always use the free tier on GrapheneDB if you want to play around with this.

  11. Add references to a driver

We need to add a reference to a Neo4j client, in this case, I’ll show the official driver, but it will work as well with the community driver. First off, we need to add a ‘project.json’ file, so press ‘View Files’ on the left hand side –

image

Then add a file:

image

Then call it project.json – and yes it has to be that name:

image

With our new empty file, we need to paste in the nuget reference we need:

{
   "frameworks": {
     "net46":{
       "dependencies": {
         "neo4j.driver": "1.5.2"
       }
     }
    }
}

Annoyingly if you copy/paste this into the webpage, the function will add extra ‘closing’ curly braces, so just delete those.

image

If you press ‘Save and Run’ you should get the same response as before – which is good as it means that the Neo4j.Driver package has been installed. If we look at the files, we’ll see the ‘project.lock.json’ file, which is what we want to see.

image

  12. Code

We want to add our connection information now, we’re going to go basic, and just return the COUNT of the nodes in our DB. First we need to add a ‘using’ statement to our code:

So add,

using Neo4j.Driver.V1;

Then replace the code in the function with:

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
     using (var driver = GraphDatabase.Driver("bolt://YOURIP:7687", AuthTokens.Basic("user", "pass")))
     {
         using (var session = driver.Session())
         {
             IRecord record = session.Run("MATCH (n) RETURN COUNT(n)").Single();
             int count = record["COUNT(n)"].As<int>();
             return req.CreateResponse(HttpStatusCode.OK, "Count: " + count);                  
         }
     }
}

Basically, we’ll create a Driver, open a session and then return a 200 with the count!
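
One thing worth knowing about the query string: the key used to read the record is the column name that Cypher returns, which is why the code asks for "COUNT(n)". A small sketch of an aliased version of the query (nodeCount is a name I’ve made up for illustration):

```cypher
// Same count, but with an explicit column name
MATCH (n)
RETURN count(n) AS nodeCount
```

If you swap this into session.Run, the lookup would become record["nodeCount"] instead.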

  13. Run

You can now ‘Save and Run’ and your output window should now tell you the count:

image

  14. Done

Your first function using Neo4j, Yay!

Neo4jClient turns 3.0

Well, version wise anyhow!

This is a pretty big release and is one I’ve been working on for a while now (sorry!). Version 3.0 of the client finally adds support for Bolt. When Neo4j released version 3.0 of their database, they added a new binary protocol called Bolt designed to be faster and easier on the ol’ network traffic.

For all versions of Neo4jClient prior to 3.x you could only access the DB via the REST protocol (side effect – you also had to use the client to access any version of Neo4j prior to 3.x).

I’ve tried my hardest to minimise the disruption that could happen, and I’m pretty happy to have it down to mainly 1 line change (assuming you’re passing an IGraphClient around – you are right??)

So without further ado:

var client = new GraphClient(new Uri("http://localhost:7474/db/data"), "user", "pass");

becomes:

var client = new BoltGraphClient(new Uri("bolt://localhost:7687"), "user", "pass");

I’ve gone through a lot of trials with various objects, and I know others have as well (thank you!) but I’m sure there will still be errors – nothing is bug free! So raise an issue – ask on StackOverflow or Twitter 🙂

If you’re using `PathResults` you will need to swap to `BoltPathResults` – the format changed dramatically between the REST version and the Bolt version – and there’s not a lot I can do about it I’m afraid!

Better Know APOC #2: apoc.graph.fromDB

A rubbish picture that really doesn't represent anything - but you could say 'apoc.graph.fromDB' - honestly - you're not missing anything here

Neo4j Version 3.0.0
APOC Version 3.3.0.1

If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.

Back at the beginning of this series (if you can remember that far!) I talked about using apoc.export.csv.* – and I showed an example of using apoc.export.csv.graph that took in a graph – and to get that graph – I used apoc.graph.fromDB. I also said I wasn’t going to cover it in that post – and I didn’t. Time to rectify that lack of knowledge!

What does it do?

apoc.graph.fromDB takes your existing DB and creates a whole new virtual graph for your use later on – we’ve seen it in use in episode 1 – the phantom men… sorry – apoc.export.csv.graph, but a virtual graph can be used in other procedures. This particular instance is a hefty ‘catch all’ version – maybe overkill for most needs – but equally – maybe exactly what you’re after (if you’re after dumping your DB).

Setup – Neo4j.conf

dbms.security.procedures.unrestricted=apoc.graph.fromDB

Ins and Outs

Calling apoc.help('apoc.graph.fromDB') gets us:

Inputs (name :: STRING?, properties :: MAP?)
Outputs (graph :: MAP?)

Inputs

Only two this time, and I reckon you can pretty much ignore them, so that’s a win?!

Name

This is as simple as it seems – just the name – I’m going to be honest here – I really am not sure what this is for – you can access it later on though. I’m pretty sure this is a hangover from the other apoc.graph.from* methods – where it makes more sense as a distinguisher – but for this procedure – as we’re just exporting the whole db, go for whatever you like.

Properties

Just a collection of key/values – accessible after the procedure has executed – but otherwise not used by the procedure.

Outputs

Just the one! Amazeballs!

Graph

This is what you need to YIELD to use the procedure (the examples will cover this) – to access the name you use:

RETURN graph.name

To get your properties it’s:

RETURN graph.properties.<your-property-here>

Examples

Assuming as always that you have the Movies DB setup and ready to roll, just call:

CALL apoc.graph.fromDB('Movies', null) YIELD graph

That will pop the whole DB into your browser – now if you do this with a MONSTER database, you’ll only see the first 300 nodes – otherwise no matter your browser you could expect epic failures.

Typically we want to RETURN something rather than just put it on the screen so:

CALL apoc.graph.fromDB('A1Graphs', null) YIELD graph
RETURN *

Oh look – exactly the same – HANDY.

Let’s (for the sake of something) pretend that we have 2 of these and we’re wanting to check the name:

CALL apoc.graph.fromDB('A1Graphs', null) YIELD graph
RETURN graph.name

That’ll get us:

image

(and the award for the dullest blog post picture goes to..)

Let’s set and get some properties:

CALL apoc.graph.fromDB('A1Graphs', {Hello: 'World'}) YIELD graph
RETURN graph.properties

Which returns

image

But if we just want one property:

CALL apoc.graph.fromDB('A1Graphs', {Hello: 'World'}) YIELD graph
RETURN graph.properties.Hello

Note – I’ve used an upper case property name, so I have to use the same case when pulling them out – (I refuse to be cowed into Java conventions). Anyhews, that returns:

image

Notes

You always need to YIELD unless you literally want to dump your DB to the screen – doing something like:

CALL apoc.graph.fromDB('A1Graphs', null) AS myGraph

Will lead to exceptions – as Neo4j is expecting you to YIELD, you can do:

CALL apoc.graph.fromDB('A1Graphs', null) YIELD graph AS myGraph

and use myGraph throughout the rest of your code no worries.
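
As a quick sketch of that, here are the name and property lookups from earlier run against the renamed variable (the name and hello aliases are purely illustrative):

```cypher
CALL apoc.graph.fromDB('A1Graphs', {Hello: 'World'}) YIELD graph AS myGraph
// myGraph now stands in for graph in everything that follows
RETURN myGraph.name AS name, myGraph.properties.Hello AS hello
```

Based on the examples above, that should come back with ‘A1Graphs’ and ‘World’.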

Better Know APOC #1: apoc.export.csv.*

Export CSV from Neo4j with APOC

Neo4j Version 3.3.0
APOC Version 3.3.0.1

If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.

We’re going to start with the Export functions – in particular exporting to CSV, mainly as people have asked about it – and well – you’ve got to start somewhere.

apoc.export.csv.*

There are 4 functions documented on the GitHub.IO page:

  • apoc.export.csv.query(query, file, config)
  • apoc.export.csv.all(file, config)
  • apoc.export.csv.data(nodes, rels, file, config)
  • apoc.export.csv.graph(graph, file, config)

All of them export a given input to a CSV file, specified by the file parameter.

Setup

There’s a couple of things we need to have in place to use these methods.

Neo4j.conf

We need to let Neo4j know that you allow it to export, and also run the export csv procedures:

apoc.export.file.enabled=true
dbms.security.procedures.unrestricted=apoc.export.csv.*

apoc.export.csv.query

In no particular order (aside from the docs order) we’ll look at the .query version of export.csv. This procedure takes a given query and exports it to a csv file – the format of the RETURN statement in the query directly affects the output, so if you return nodes, you get full node detail.

From the help procedure we get the following for the signature:

Inputs (query :: STRING?, file :: STRING?, config :: MAP?)
Outputs (file :: STRING?, source :: STRING?, format :: STRING?, nodes :: INTEGER?, relationships :: INTEGER?, properties :: INTEGER?, time :: INTEGER?, rows :: INTEGER?)

Inputs

Query

I hope this is obvious – but – it’s the query you want to use to get your CSV columns – personally I write the query first, make sure it’s working then simply copy/paste into my apoc call, so let’s say I want to get all the Movies a given Person (Tom Hanks) has ACTED_IN, I would do:

MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN m.title, m.released

File

This is the filename to export to – I always go fully qualified, but should you want to go relative, it’s relative to the Neo4j home directory

Config

There is only one config setting that affects this procedure:


Parameter   Description
d           Sets the delimiter for the export. This can only be one character, so something like ‘-’ is ok, or ‘\t’ (for tabs).

Outputs

File

This is what you passed in. It doesn’t give you the fully qualified location, just what you passed in.

Source

This will say something like ‘statement: cols(2)’ to indicate the query returned 2 columns, I don’t think I need to explain that the number will change depending on what you return.

Format

Always gonna be csv

Nodes

If you’re returning nodes instead of just properties, this will give you the count of nodes you are exporting.

Relationships

This returns the count of the relationships being returned.

Properties

Will be 0 if you’re just returning nodes, otherwise will be a total count of all the properties returned. So if you’re returning 2 properties, but from 12 nodes, you get 24 properties.

Time

How long the export took in milliseconds – bear in mind – this will be less than the total query time you’ll see in something like the browser, due to rendering etc

Rows

The number of rows returned by the query and put into the CSV file – directly matches the number of lines you’ll have in your CSV file.

Examples

We want to export all the Movie titles and release years for films Tom Hanks has ACTED_IN – we’ve already got that query in place (up above) so let’s put it into export.csv:

CALL apoc.export.csv.query(
     "MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN m.title, m.released", 
     "c:/temp/exportedGraph.csv", 
     null
)

I like to put the parameters onto different lines, you can whack it all in one line if that’s your fancy! I’m passing null for the config at the moment, as there’s no change I want to make. If I run this, I will get:

image

We can change that and return just the ‘m’:

CALL apoc.export.csv.query(
    "MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN m", 
    "c:/temp/exportedGraph.csv", 
     null
 )

Which gives a lot more detail – about all of the node:

image

OK, let’s play with the parameter, now I’m a big fan of the tab delimited format, so let’s make that happen:

CALL apoc.export.csv.query(
     "MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN  m.title, m.released", 
     "c:/temp/exportedGraph.csv", 
     {d:'\t'}
)

That gets us:

image
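
If you only care about a few of the output columns listed earlier, you can YIELD just those. A sketch using the same Tom Hanks query (the choice of file, rows and time is just an example):

```cypher
CALL apoc.export.csv.query(
     "MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN m.title, m.released",
     "c:/temp/exportedGraph.csv",
     null
) YIELD file, rows, time
// Only the columns named in YIELD are available to RETURN
RETURN file, rows, time
```

Once you YIELD, you need a RETURN to finish the query off, which is why there are two extra lines rather than one.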

apoc.export.csv.all(file, config)

A lot of the following detail is the same as for the above procedure, so this will be short (in fact the next couple as well).

Config is the same – just one parameter to play with – ‘d’ or delimiter. What we don’t have is the ‘query’ parameter anymore – now we get the entire content of the database in one go – boom!

//Tab delimiting the 'all' query
CALL apoc.export.csv.all(
   "c:/temp/exportedGraph.csv", 
   {d:'\t'}
)

What you’ll find is that as you have no control over the format of the result, the authors of APOC have kindly pulled all the properties out, so if you run it against our Movies database, you get:

"_id","_labels","name","born","released","title","tagline","_start","_end","_type","roles","rating","summary"

as the header, and obviously all the rows are all the nodes!

But hang on.. Movie doesn’t have a born property – (at least in our DB). If we look at our DB we actually have a Movie node and a Person node, and all is dumping out everything – when you scroll down the file you’ll see the different rows have their _labels property.

(To be said in a Columbo voice) “Just one last thing”… we also get the relationship details – if you scroll down the file you’ll see rows like:

,,,,,,,"322","325","ACTED_IN","[""Neo""]","",""

These are the relationships – so you really are getting everything.

apoc.export.csv.data(nodes, rels, file, config)

OK, in this one we’re looking to pass a collection of Nodes and Relationships to be exported, but how do we get those nodes and relationships? With a query!

MATCH (m:Movie)<-[r:ACTED_IN]-(p:Person {name:'Tom Hanks'})
WITH COLLECT(m) AS movies, COLLECT(r) AS actedIn
CALL apoc.export.csv.data(
 movies, 
 actedIn, 
 'c:/temp/exportedGraph.csv', 
 null
) YIELD file, nodes, relationships
RETURN file, nodes, relationships

A couple of things to note, in line 2, we COLLECT the m and r values so we can pass them to the APOC procedure.

You may ask “Hey Chris, why exactly are you YIELDing? We didn’t need to do that before”, and you’re right of course. But because we have a MATCH query at the top the Cypher parser won’t let us end with a CALL clause, so we need to YIELD the result from the CALL and then RETURN those results as well (we’ll have to do this with the next procedure as well).

Now we’ve got the nodes and relationships exported we’ll find when we look at our CSV that it’s taken the same approach as the all procedure and picked out the properties so we’ll end up with a header like:

"_id","_labels","title","tagline","released","_start","_end","_type","roles"

Which makes a LOAD CSV call super easy later on – should that be your dream goal (cypher dreams are reserved for Michael Hunger only 🙂 ).
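
A rough sketch of what reading that file back could look like. This assumes the exported file has been copied into Neo4j’s import directory (that location is my assumption, not something the export does for you):

```cypher
// Assumes exportedGraph.csv has been copied into Neo4j's import directory
LOAD CSV WITH HEADERS FROM 'file:///exportedGraph.csv' AS row
// LOAD CSV reads every field as a string, so convert the year back to a number
RETURN row.title AS title, toInteger(row.released) AS released
LIMIT 5
```

The file also contains the relationship rows, which have no title, so a real import would need to tell the two kinds of row apart (for example by checking the _type column).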

Config wise – it’s the same as the others, only the ‘d’ option for the delimiter.

apoc.export.csv.graph(graph, file, config)

This procedure exports a graph – simple really. But what is a graph when it’s at home? Well that leads to another APOC procedure – apoc.graph.fromDB, which we’ll not cover here; needless to say Imma going to use it:

CALL apoc.graph.fromDB('movies', {}) YIELD graph
CALL apoc.export.csv.graph(
 graph,
 'c:/temp/exportedGraph.csv',
 null
 ) YIELD file, nodes, relationships
RETURN file, nodes, relationships

Say WHAT??!

OK, simples – the first line exports the whole database (I’m not doing any filtering here) into a graph identifier which I then pass into the call to export.

This exports in exactly the same way as the all version – so if you pass an entire DB – you get an entire DB. In fact – the code above does exactly the same as the call to all just in a more complicated way, and in keeping with that theme, the old faithful ‘d’ for delimiter config parameter is there for your use.

Summary

OK, we’ve looked at all the export.csv procedures now, (unless in the time I’ve taken to write this they’ve put another one in) and hopefully I’ve cleared up a few questions about how it works and what exactly the configuration options are. If not, let me know and I’ll augment.

Better Know APOC #0: Introduction

APOC

As all developers in good languages know – arrays and blog series start at 0, not 1 (sorry VB developers) so this is the introductory post to the upcoming series about APOC and its procedures etc.

What is APOC?

APOC stands for Awesome Procedures On Cypher – and is a large collection of community functions and procedures for use in Neo4j. You can read more about it on the page linked to above. Also – just so you know – I’m going to refer to them as procedures to save writing ‘functions and procedures’ all the time – because I’m lazy.

Setup for these posts

A lot of this stuff is listed on the APOC page, but use this as a quick get up to speed guide for when you’re reading the subsequent posts on here.

Setup – Configuration

You’ll need to setup your DB to be able to run the procs. We do this by adding a configuration property to your neo4j.conf file. The simplest (but most insecure) would be to add:

dbms.security.procedures.unrestricted=apoc.*

which will allow you to run all the APOC procedures. In the posts I’ll give a more specific version so you can execute just the ones we’re looking at, but if you have the above configuration property in your config – you can ignore that part of the posts.

Some of the procedures (export/import generally) require an additional configuration property, but that’ll be covered in the posts.

Setup – Data

90% of the time, if I can get away with it – I’m going to use the ‘Movies’ example database, it has many benefits – it’s a well-known type of data (most people know Movies have Actors etc) and it’s available to everyone who has Neo4j running.

To get the data you run:

:play movies

In the Neo4j Browser. Press the ‘next’ arrow and run the first query you see. You don’t need to do the rest.

Common stuff

The APOC procedures generally follow a common pattern, and there are some things you can do to help yourself

Get help

This is the first and most basic thing you can do – APOC has a built in help procedure, which you can run to get the signature of the procedure you’re interested in. In the posts the signatures will come from this method:

CALL apoc.help('NAME OF PROC')

This will return a table with 6 columns:

  • Type: Whether it’s a procedure or function – this is important so you know how to call the thing
  • Name: If you’ve put in a non-complete name for help, this will tell you which proc you’re actually looking at
  • Text: Erm
  • Signature: This is the signature of the procedure (the most important bit from our POV)
  • Roles: If you have security setup – this is which roles can see/use this procedure
  • Writes: Ahhh
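
As a concrete sketch, this looks up the sort functions and keeps three of those columns. I’m assuming here that the column names are the lower-case versions of the list above (type, name, signature):

```cypher
// Look up the sort functions and keep only three of the columns
CALL apoc.help('apoc.coll.sort') YIELD type, name, signature
RETURN type, name, signature
```

Because the name given is non-complete, I’d expect more than one row back, one for each function whose name matches.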

The signature is the main thing and that can be broken down into 2 parts, innies and outies, initially it can look a bit daunting as it’s one big lump of text, but break it down and it becomes easier:

image

Type wise – they’re all Java types, so easy to understand (Map = Dictionary .Netters).

Now you have the info – you have to pass all the parameters in – i.e. if there are 3 parameters, you can’t just pass in 2 and default the 3rd, you have to pass in all 3.

The Config Parameter

Pretty much all of the procedures will have this parameter:

config :: MAP?

This is a general config dictionary that allows the procedures to take in a large number of parameters without making the parameters list 1/2 a mile long (that’s 0.8KM my European friends).

It’s a Map<String, Object> and to set it when calling in Cypher you use the curly brace technique, so let’s say I want to set a parameter called ‘verbose’ to ‘true’ I would put:

{ verbose:true }

Easy – adding another parameter like (for example) ‘format’ and ‘json’ I would do:

{ verbose: true, format: 'json' }

One thing to bear in mind if you decide to look through the source code (and you should to see what these things are doing) is that the Config isn’t always just for the method you’re looking at.

For example ExportConfig is used by:

  • apoc.export.csv.*
  • apoc.export.cypher.*
  • apoc.export.graphml.*
  • etc

So whilst ExportConfig might well list 7 properties – the method you’re looking at may only actually use one of them – and setting the others will have no effect.

Excel & Neo4j? Let’s code that! (VSTO edition)

So you have a new Graph Database, it’s looking snazzy and graphy and all that, but, well – you really want to see it in a tabular format, ‘cos you’ve got this Excel program hanging about, and well – who doesn’t love a bit of tabular data?

Obviously there are loads of reasons why you might want that data – maybe to do a swanky graph, perhaps to pass over to the boss. You can also get that data into Excel in a few ways –

  • Open up from the Web – you can try opening up the REST endpoint in Excel directly – (I say try because quite frankly – it’s not looking like a good option)
  • Create an application to export to CSV – this is easy – writing a CSV/TSV/#SV is a doddle (in any language) but does mean you have to give it to people to run, and that might give more headaches – however it’s an option!
  • Create an Excel Addin that runs within Excel – slightly more complicated as you need to interact with Excel directly – but does have the benefit that maybe you can use it to send data back to the db as well..

As you can imagine, this is about doing the third option – to be honest, I would only ever pick options 2 or 3, and if I’m super honest – I would normally go for option 2 – as it’s the simplest. Option 3 however has some benefits I’d like to explore.

If you want to look at the project – you can find it at: https://github.com/DotNet4Neo4j/Neo4jDriverExcelAddin

I’ll be using the official driver (Neo4j.Driver) and VSTO addins, with VS 2017.

Onwards!

Sidenote

As I was writing this, I was going to do my usual – step-by-step approach, so went to take a screenshot and noticed this:

image

So we’re going to do a quick overview of the VSTO version, then the next post will tuck into the Excel Web version which looks snazzier – but I don’t have an example as of yet…

Onwards again!

Sidenote 2: Sidenote Harder

As the code is on github I’m not going to show everything, merely the important stuff, as you can get all the code and check it out for yourself!

So – pick the new VSTO addin option:

image

And create your project. You’ll end up with something like this:

image

OK, so an addin needs a few things –

  1. A button on the ribbon
  2. A form (yes, WinForm) to get our input (cypher)
  3. The code that executes stuff

The Form

That’s right. Form. Actually – UserControl, but still WinForms (Hello 2000), let’s add our interface to the project, right click and ‘Add New Item’:

image

For those who’ve not had the pleasure before, the key thing to learn is how the Anchors work to prevent your form doing weird stuff when it’s resized.

Add a textbox to the control:

image

Single line eh? That’s not very useful – let’s MULTI-LINE!

Right–click on the box and select properties and that properties window you never use pops up, ready to be used! Change the name to something useful – or leave it – it’s up to you – the key settings are Anchor and Multiline. Multiline should be true, Anchor should then be all the anchors:

image

If you resize your whole control now, you should see that your textbox will expand and contract with it – good times!

Drag a button onto that form and place it to the bottom right of your textbox, and now we need to set the anchors again, but this time to Bottom, Right so it will move with resizing correctly – also we should probably change the Text to something more meaningful than button1 – again – don’t let me preach UX to you! Play around, make it bigger, change the colour, go WILD.

Once your button dreams have been realised – double click on the button to be taken to the code behind. First we’ll add some custom EventArgs:

internal class ExecuteCypherQueryArgs : EventArgs
{
     public string Cypher { get; set; }
}

and then a custom EventHandler:

internal EventHandler<ExecuteCypherQueryArgs> ExecuteCypher;

Then we call that event when the button is pressed, so the UserControl code looks like:

public partial class ExecuteQuery : UserControl
{
    internal EventHandler<ExecuteCypherQueryArgs> ExecuteCypher;

    public ExecuteQuery()
    {
        InitializeComponent();
    }

    private void _btnExecute_Click(object sender, EventArgs e)
    {
        if (string.IsNullOrWhiteSpace(_txtCypher.Text))
            return;

        ExecuteCypher?.Invoke(this, new ExecuteCypherQueryArgs { Cypher = _txtCypher.Text });
    }
}

The Ribbon

OK, we now have a form, but no way to see said form, so we need a Ribbon. Let’s add a new Ribbon (XML) to our project:

image

Open up the new .xml file and add the following to the <group> element:

<button id="btnShowHide" label="Show/Hide" onAction="OnShowHideButton"/>

Now open the .cs file that has the same name as your .xml and add the following:

internal event EventHandler ShowHide;

public void OnShowHideButton(Office.IRibbonControl control)
{
    ShowHide?.Invoke(this, null);
}

Basically, we raise an event when the button is pressed. But what is listening for this most epic of notifications??? That’s right.. it’s:

ThisAddIn.cs

The unfortunate part about going from here on in is that this is largely plumbing… ugh! The code around how to show/hide a form I’ll skip over – it’s all in the GitHub repo and you can read it easily enough.

There are a couple of bits of interest – one is the ThisAddIn_Startup method, in which we create our Driver instance:

private void ThisAddIn_Startup(object sender, EventArgs e)
{
     _driver = GraphDatabase.Driver(new Uri("bolt://localhost"), AuthTokens.Basic("neo4j", "neo"));
}

To improve this, you’d want to get the URL and login details from the user somehow, perhaps a settings form – but I’ll leave that to you! – The important bit is that we store the IDriver instance in the addin. We only want one instance of a Driver per Excel, so this is fine.

The other interesting method is the ExecuteCypher method (which is hooked up in the InitializePane method). This takes the results of our query and puts them into Excel:

private void ExecuteCypher(object sender, ExecuteCypherQueryArgs e)
{
    var worksheet = ((Worksheet) Application.ActiveSheet);

    using (var session = _driver.Session())
    {
        var result = session.Run(e.Cypher);
        int row = 1;
        
        foreach (var record in result)
        {
            var range = worksheet.Range[$"A{row++}"]; //TODO: Hard coded range
            range.Value2 = record["UserId"].As<string>(); //TODO: Hard coded 'UserId' here.
        }
     }
}

Again – HardCoded ranges and ‘Columns’ (UserId) – you’ll want to change these to make sense for your queries, or even better, just make them super generic.
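Here’s a rough sketch of the ‘super generic’ idea: instead of hard coding ‘A’ and ‘UserId’, work out a cell address for every key in every record. It’s in Java rather than C# (the logic ports pretty much line for line), and CellAddresses, columnName and address are names I’ve made up for illustration – they’re not part of VSTO or the Neo4j driver. The keys in main are invented too.

```java
// Hypothetical helper: turns (column index, row number) into an Excel A1-style address.
class CellAddresses {

    // Zero-based column index to Excel letters: 0 -> A, 25 -> Z, 26 -> AA.
    static String columnName(int index) {
        StringBuilder name = new StringBuilder();
        for (int i = index; i >= 0; i = i / 26 - 1) {
            name.insert(0, (char) ('A' + i % 26));
        }
        return name.toString();
    }

    // Rows are one-based, the same way Excel numbers them.
    static String address(int columnIndex, int row) {
        return columnName(columnIndex) + row;
    }

    public static void main(String[] args) {
        // Pretend these are the keys of one record, in the order the query returned them.
        String[] keys = { "UserId", "Name", "Email" };
        int row = 1;
        for (int column = 0; column < keys.length; column++) {
            System.out.println(keys[column] + " -> " + address(column, row));
        }
    }
}
```

In the addin you’d loop over whatever keys each record actually has, and use the address in place of the hard coded $"A{row++}" – no more TODOs.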

Summing Up

So now we’re at this stage, we have an Excel addin using VSTO that can call Cypher and display the results, there are things we probably want to add – firstly – remove all the hard coded stuff. But what about being able to ‘update’ results based on your query?? That’d be cool – and maybe something we’ll look at in the next addin based post (on Web addins).

Using PowerBI with Neo4j

There’s an excellent post by Cédric Charlier over at his blog about hooking Neo4j into PowerBI. It’s simple to follow and gets you up and running, but I (as a PowerBI newbie) had a couple of spots where I ran into trouble – generally where the post assumes you already know how to navigate around the PowerBI interface. (I didn’t).

So, here is a simple tutorial to get us non-BI people up and running!

The Setup Steps

First – we’ve got to install PowerBI – now, I didn’t sign up for an account, but downloaded it from the PowerBI website, and installing was simple and quick.

We also need to have Neo4j running, and you can use Community or Enterprise, it matters not – and we’ll want to put the ‘Movies’ dataset in there, so run your instance, and execute:

:play Movies

 

Now we’re ready to ‘BI’!

Step 1 – Start Power BI Desktop

This is pretty obvious, but in case you need it – click on the ‘Power BI Desktop’ link in your start menu – or double click on it if you went and put it on the Desktop. Crazy days.

Step 2 – Click on ‘Get Data’

image

That way we can get data!

Step 3 – Select ‘Blank Query’

Why not ‘web’ you ask? Well as we’re going to do some copy/pasting – it’s easier from a blank query point of view.

image

Step 4 – Advanced

In the query editor window that pops up, select ‘Advanced Editor’

image

Step 5 – Get Data!

We’re going to use the same query as Cédric as you can then use this post to augment his, so in the query editor simply paste:

let
    Source = Web.Contents( "http://localhost:7474/db/data/transaction/commit",
             [
                 Content=Text.ToBinary("{
                          ""statements"" : [ {
                          ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
             }")]
             )
in
    Source

Oh noes! The same error as Cédric got – authentication. You can’t send the login details via changing the URL to be something like:

http://user:pass@localhost….

as that also fails, but you can send in the auth as a header, by adding this line:

Headers = [#"Authorization" = "Basic bmVvNGo6bmVv"],

What is this bmVvNGo6bmVv? Well, that’s the base64 encoded user/pass combo – which is a bit uh oh as you have to generate this 🙁

I’ve got two options here – LinqPad and Powershell

LinqPad

Using this bit of C# – obviously – you can write your own C# app in VS or whatever, but typically I use LinqPad for quick scripts.

var username = "neo4j";
var password = "neo";

var encoded = Encoding.ASCII.GetBytes(string.Format("{0}:{1}", username, password));
var base64 = Convert.ToBase64String(encoded);

base64.Dump();

 

Powershell

This does pretty much the same, but can obviously be run in a Powershell prompt – which is nice!

 

Param(
    [string]$username,
    [string]$password
)

$encoder = [system.Text.Encoding]::UTF8
$token = $username + ":" + $password
$encoded = $encoder.GetBytes($token)

$base64 = [System.Convert]::ToBase64String($encoded)
Write-Output $base64

which is then used like:

.\GetAuthCode.ps1 -username neo4j -password neo

So, with this information, our new ‘Get data’ bit looks like:

let
    Source = Web.Contents( "http://localhost:7474/db/data/transaction/commit",
             [
                 Headers = [#"Authorization" = "Basic bmVvNGo6bmVv"],
                 Content=Text.ToBinary("{
                          ""statements"" : [ {
                          ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
             }")]
             )
in
    Source

which when we ‘preview’ gives us this:

image

Step 6 – Read as Json

Select the ‘localhost’ file and then choose ‘open as Json’ from the top menu:

image

You’ll notice once you’ve done this – your ‘Source’ has changed to now be ‘Json.Document(Web.Contents…)’

image

Step 7 – Navigation

First click on the ‘List’ of ‘results’.

This will take you to a screen that looks like this:

image

Note, you now have another ‘Step’ in the right hand bar – by the way – if you ever ‘lose’ the Settings side bar – click on ‘View’ at the top and select ‘Query Settings’ to bring it back.

Then click on the ‘Record’ link, and then the ‘List’ for data:

image

Worth noting here, we’re still in the ‘Navigation’ step

Now you should have a list of ‘Record’s –

image

Step 8 – Table-ify

Go ahead and press the ‘To Table’ button, and then just ‘OK’ on the dialog that pops up:

image

Step 9 – Expand the Column

Records aren’t useful to Power BI (apparently) so – we need to expand that column out and to do that we click on the ‘Expand’ button – and in our case – we only want the ‘row’, not the meta, so unselect the ‘meta’ and press OK

image

Now you should see a row of ‘List’ and an extra step in our ‘Applied Steps’ list:

image

Step 10 – Add a custom column

So now we need to get the information out of these new ‘Lists’ – and to do that we need a custom column, so click on the ‘Custom Column’ button in the ‘Add Column’ tab:

image

In the dialog that pops up we want to have it say:

= Record.FromList([Column1.row], type[Name = text, Rank = number])

image

Then press OK, and you’ll have another Column called ‘Custom’, and another item in our Applied Steps:

image

Step 11 – Expand Custom

More records eh? Let’s expand it out, so as before, click on the ‘Expand’ button:

image

and in this case, we want all the columns:

image

Now you should have two new columns, and another step added:

image

Data! Yay!
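If all that clicking felt a bit like magic: steps 7 to 11 are really just walking down the JSON that the transactional endpoint sends back – results, then the first result, then data, then the ‘row’ of each entry. Here’s a small Java sketch of the same walk over a hand-built copy of that shape. The class and method names are made up, and the two rows (‘Actor A’, ‘Actor B’) are invented sample values, not real query output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResponseShape {

    // One entry of the "data" list: the values we want ("row") plus the bit we unticked ("meta").
    static Map<String, Object> dataEntry(String name, int strength) {
        Map<String, Object> entry = new HashMap<>();
        entry.put("row", Arrays.<Object>asList(name, strength));
        entry.put("meta", Arrays.<Object>asList(null, null));
        return entry;
    }

    // Hand-built stand-in for:
    // { "results": [ { "columns": [...], "data": [ { "row": [...], "meta": [...] } ] } ], "errors": [] }
    static Map<String, Object> sampleResponse() {
        Map<String, Object> first = new HashMap<>();
        first.put("columns", Arrays.asList("Recommended", "Strength"));
        first.put("data", Arrays.asList(dataEntry("Actor A", 5), dataEntry("Actor B", 3)));

        Map<String, Object> response = new HashMap<>();
        response.put("results", Arrays.asList(first));
        response.put("errors", new ArrayList<Object>());
        return response;
    }

    // The same navigation as the Power BI steps, one hop per line.
    @SuppressWarnings("unchecked")
    static List<List<Object>> rows(Map<String, Object> response) {
        List<Object> results = (List<Object>) response.get("results");    // click the 'List' of results
        Map<String, Object> first = (Map<String, Object>) results.get(0); // click the 'Record'
        List<Object> data = (List<Object>) first.get("data");             // click the 'List' for data
        List<List<Object>> rows = new ArrayList<>();
        for (Object entry : data) {
            Map<String, Object> record = (Map<String, Object>) entry;
            rows.add((List<Object>) record.get("row"));                   // keep 'row', ignore 'meta'
        }
        return rows;
    }

    public static void main(String[] args) {
        for (List<Object> row : rows(sampleResponse())) {
            System.out.println(row.get(0) + " / " + row.get(1));
        }
    }
}
```

The custom column in step 10 is then just giving names to positions 0 and 1 of each of those rows.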

Step 12 – Remove that non-useful column

Right click on the ‘Column1.row’ column and select Remove

image

Step 13 – Close & Apply

Now we have data in a format we can use in Power BI, let’s close and apply that query.

image

Step 14 – Use that data

Now – I’m no Power BI user – so this is super simple and pointless, but should get you going for experimenting.

After applying that query we’re back in the main desktop view, but now in the right hand side – we have some fields with our Query there:

image

Let’s VISUALIZE

I’m going to pick a ‘Treemap’ – because.

image

Empty treemap – Check!

image

Let’s set some data, I want to group by ‘Rank’, so I drag ‘Custom.Rank’ to the ‘Group’ section which is in the ‘Visualizations’ bar:

image

And then for ‘Values’ I’m going to drag the ‘Custom.Name’ field

image

Oooooh – colours:

image

Let’s expand our visualization by pressing the ‘Focus Mode’ button:

image

Boom! Full size

Now, if I hover over one of those boxes I get the brief info displayed:

image

Ace, only 2 names with a rank of 5, and to see who they are, right click and select ‘See Records’

image

And here they are:

image

No More Steps

If you want to just copy/paste the code, you can! Create a new blank query and open up the advanced editor and just paste the code below in. (NB There are probably loads of things which are rubbish about this implementation, lemme know!)

let
    Source = 
        Json.Document(
            Web.Contents("http://localhost:7474/db/data/transaction/commit",
            [
                Headers=[Authorization="Basic bmVvNGo6bmVv"],
                Content=Text.ToBinary("{""statements"" : [ {
                        ""statement"" : ""MATCH (tom:Person {name:'Tom Hanks'})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors), (coActors)-[:ACTED_IN]->(m2)<-[:ACTED_IN]-(cocoActors) WHERE NOT (tom)-[:ACTED_IN]->(m2) RETURN cocoActors.name AS Recommended, count(*) AS Strength ORDER BY Strength DESC""} ]
                        }")
            ])),
    results = Source[results],
    results1 = results{0},
    data = results1[data],
    #"Converted to Table" = Table.FromList(data, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"row"}, {"Column1.row"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Column1", "Custom", each Record.FromList([Column1.row], type[Name = text, Rank = number])),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"Name", "Rank"}, {"Custom.Name", "Custom.Rank"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Column1.row"})
in
    #"Removed Columns"

 

Writing a Stored Proc in Neo4j for .NET Developers

I’m a .NET developer and I have been for about 13 years or so now, predominantly in C#, but I originally (in my university days) started off programming in Java. Now, I’ve not touched Java for roughly 13 years, and I’m pretty happy with that situation.

3 years(ish) ago I started using Neo4j – as you might notice from previous blog posts. The ‘j’ does indeed stand for Java and I had the feeling that some day – some dark day – I would have to flex those Java muscles again.

Turns out they had atrophy-ed to nothing.

Bum.

I’m guessing that I’m not the only .NET dev out there who fears the ‘j’ – so here’s a quick get-up-and-go guide to writing a Neo4j Stored Procedure.

We’re going to write a stored procedure to return a movie and all its actors from the example ‘movies’ database, which you can add to your instance by running ‘:play movies’ in the Neo4j Browser.

The Tutorial

We’re going to go through the steps to create a super simple stored proc that just gets the actors for a given movie.

Step 1. Install the JDK

I never thought I’d have to write that. Ho Hum, so I google for ‘Java SDK’ and pick the ‘Java SE Development Kit 8’ link (top one for me). Then download the appropriate JDK for your environment, x64 or x86. Then install that bad boy. Oooh, Java is used on X Billion devices – good to know!

Step 2. Install Gradle

Just go to https://gradle.org/install and follow the instructions – if you have ‘scoop’ or ‘chocolatey’ installed, then you can use them, I went manual install.

Side note 1 – WTF is Gradle?

Gradle is a build-automation system for Java, I guess a bit like MSBuild – with Nuget built in. As we’ll see a bit later on, you add the dependencies which are then pulled from Maven (the Java Nuget (I think)) – most of the posts you see online referring to how to create stored procs for Neo4j will use a Maven setup, but APOC (a large community driven set of stored procs) uses Gradle, and I reckon if they’re using Gradle, it’s probably better than Maven. Or it’s newer and shinier. Either way – I’m going Gradle.

Step 3. Choose your IDE – IntelliJ

To be honest, if you go with anything other than IntelliJ IDEA – you may as well stop reading – as this is all written from an IntelliJ point of view. I’m using the Ultimate edition, but I have no doubt this will be pretty much the same on the Community (free!) version.

Download and install.

Step 4. Start IntelliJ

image

It does have a nicer splash screen than Visual Studio – and JetBrains write Resharper – so hopefully the changeover isn’t as jarring (ha!) as it could be.

Step 5. New Project!

image

But what new project?

image

What we’re going to go for is a ‘Gradle’ project, choosing Java and using the 1.8 SDK:

image

Step 6. GroupId and ArtifactId

Pressing ‘next’ gets us to a window allowing us to set the groupId and artifactId of our project.

Step 7…

Wait what? GroupId? ArtifactId? What on earth are they??? Shouldn’t there just be ‘Name’?

OK, you can think of these as kind of like a namespace and a dll (jar) name.

GroupId – a name to uniquely identify your project across all projects. Typically (it seems) this usually follows the convention of ‘org.<companyName>.<projectName>’ so, (going all MS) I might have: ‘org.contoso.movies’.

ArtifactId – Basically the name of the JAR file that is created, minus any versioning information. Lowercase only folks – cos it’s Java, and I guess to optimise keyboard usage they opted to shun the shift key.

image

As you can see, I’ve got a company name of ‘cskardon’ and a JAR name of ‘movies-procs’. I’ve left the Version as it was. Just because. Hit Next, next.

Step 7. More settings!

Don’t worry we’re nearly there,

I turned on ‘Auto-import’ and ‘Create directories for empty content roots automatically’. I’m using the default gradle wrapper – this basically (as far as I know) puts a copy of gradle into your folder so you can run ‘gradle.bat’ from the command prompt and have it do all the things. Either way, it does mean you don’t have to install gradle if you’re just using the code.

You will need to make sure the Gradle JVM is set to ‘1.8’ (see the picture below) it won’t work with the JAVA_HOME option.

image

Step 8. Locations!

Finally! Locations! Name wise – we’ll stick with what we selected for the artifactId in step 6, this makes life easier – and location wise – go for wherever you like – it’s your computer after all.

image

Note, we now have a ‘finish’ button – no more ‘next’ HUZZAH!

Step 9. Expand and config files

First off, let’s expand the ‘movies-procs’ node:

image

Now, double click on the ‘build.gradle’ file. We need to add some things here to get access to libraries for Neo4j. First up is a ‘project.ext’ element:

project.ext {
    neo4jVersion = "3.2.0"
}

This needs to be below the sourceCompatibility element, and above the repositories element. Speaking of which, we need some more repositories, so set the repositories element to:

repositories {
    mavenLocal()
    maven { url "https://m2.neo4j.org/content/repositories/snapshots" }
    mavenCentral()
    maven { url "http://oss.sonatype.org/content/repositories/snapshots/" }
}

Now we need to change the dependencies so we can use all the goodies.

dependencies {
    compile group: 'commons-codec', name: 'commons-codec', version:'1.9'
    compile 'com.jayway.jsonpath:json-path:2.2.0'

    compileOnly group: 'net.biville.florent', name: 'neo4j-sproc-compiler', version:'1.2'

    testCompile group: 'junit', name: 'junit', version:'4.12'
    testCompile group: 'org.hamcrest', name: 'hamcrest-library', version:'1.3'
    testCompile group: 'org.apache.derby', name: 'derby', version:'10.12.1.1'
    testCompile group: 'org.neo4j', name: 'neo4j-enterprise', version:neo4jVersion
    testCompile group: 'org.neo4j', name: 'neo4j-kernel', version:neo4jVersion, classifier: "tests"
    testCompile group: 'org.neo4j', name: 'neo4j-io', version:neo4jVersion, classifier: "tests"

    compileOnly(group: 'org.neo4j', name: 'neo4j', version:neo4jVersion)
    compileOnly(group: 'org.neo4j', name: 'neo4j-enterprise', version:neo4jVersion)

    compileOnly(group: 'org.codehaus.jackson', name: 'jackson-mapper-asl', version:'1.9.7')
    testCompile(group: 'org.codehaus.jackson', name: 'jackson-mapper-asl', version:'1.9.7')

    compileOnly(group: 'org.ow2.asm', name: 'asm', version:'5.0.2')

    compile group: 'com.github.javafaker', name: 'javafaker', version:'0.10'
    compile group: 'org.apache.commons', name: 'commons-math3', version: '3.6.1'
}

By the way – I’d like to point out I have largely got this from the APOC library, so it’s probably bringing in too much, and is probably overkill, but later on when you need something obscure, it’s probably already there. So… Win!

Step 10. Package 1

10 steps to get to programming, but on the plus side – each stored proc you add to this project doesn’t need the setup, and it’s a one-off for each project. Anyhews.

So we’re going to add a package, which is a namespace. In this case we’re going to add one called ‘common’:

Expand the ‘src/main’ folders – and right click on the ‘java’ folder, then add –> new –> Package

image

Now you have the package there:

image

Step 11. A class

Now time to add a Java file – called ‘MapResult’ – this is entirely taken from APOC.

image

Type is in this case a class:

image

Highlight everything in the class that is created, and paste the below into it:

package common;

import java.util.Collections;
import java.util.Map;

public class MapResult {
    private static final MapResult EMPTY = new MapResult(Collections.<String, Object>emptyMap());
    public final Map<String, Object> value;

    public static MapResult empty() {
        return EMPTY;
    }

    public MapResult(Map<String, Object> value) {
        this.value = value;
    }
}

This allows us to map each row of our query’s result: Neo4j turns the public ‘value’ field into the output column of the procedure.

Step 12. Package 2

OK, now we’re going to add another package to the ‘java’ folder, this time called ‘movie’, and a class within that called ‘ActorProcedures’ – not necessarily the best named class :/

image

Step 13. Code!

I’m just going to ask you to paste the below into your code window, and we’ll go over it in a minute or two:

package movie;

import common.MapResult;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.Procedure;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import static java.lang.String.format;
import static java.lang.String.join;

public class ActorProcedures {
    // Neo4j injects the database into this field when the procedure is called.
    @Context
    public org.neo4j.graphdb.GraphDatabaseService _db;

    // Prefixes the query with a WITH clause that re-declares each parameter as a variable.
    public static String withParamMapping(String fragment, Collection<String> keys) {
        if (keys.isEmpty()) return fragment;
        String declaration = " WITH " + join(", ", keys.stream().map(s -> format(" {`%s`} as `%s` ", s, s)).collect(Collectors.toList()));
        return declaration + fragment;
    }

    // Callable from Cypher as movie.getActors (package name + method name).
    @Procedure
    public Stream<MapResult> getActors(@Name("title") String title) {
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("titleParam", title);
        return _db.execute(withParamMapping("MATCH (m:Movie)<-[:ACTED_IN]-(a:Person) WHERE m.title = {titleParam} RETURN a", params.keySet()), params).stream().map(MapResult::new);
    }
}
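The bit that probably looks oddest to .NET eyes is withParamMapping, so here’s a standalone sketch you can run without Neo4j to see what it actually does to the query text. ParamMappingDemo is a made-up class name, and I’ve used Collectors.joining instead of String.join purely to keep it short – the resulting string should be the same.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.stream.Collectors;

class ParamMappingDemo {

    // Same idea as withParamMapping in ActorProcedures: for every parameter name,
    // emit {`name`} as `name`, join them with ", " and stick " WITH " on the front.
    static String withParamMapping(String fragment, Collection<String> keys) {
        if (keys.isEmpty()) return fragment;
        String declaration = " WITH " + keys.stream()
                .map(key -> String.format(" {`%s`} as `%s` ", key, key))
                .collect(Collectors.joining(", "));
        return declaration + fragment;
    }

    public static void main(String[] args) {
        String query = "MATCH (m:Movie)<-[:ACTED_IN]-(a:Person) WHERE m.title = {titleParam} RETURN a";
        // Prints the query prefixed with:  WITH  {`titleParam`} as `titleParam`
        System.out.println(withParamMapping(query, Arrays.asList("titleParam")));
    }
}
```

So the stored proc ends up running our MATCH with a WITH clause in front that exposes each parameter as a variable of the same name, and the params map supplies the actual value of titleParam.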

Step 14. Build

Yes – I know we’ve eschewed tests, we’ll come to those later, for now we just want to do the standard ‘build’. Because we’re using gradle, we’re going to use IntelliJ to help us run the build: first go to ‘View’, then ‘Tool Windows’ and select ‘Gradle’

image

In the window that pops up, expand the ‘Tasks’ and then ‘build’ collapsed elements, and double click on ‘build’:

image

You should get a ‘Run’ window popping up at the bottom of the screen, looking a bit like this:

image

You should also now have a ‘build’ folder in your project window, with a ‘libs’ folder inside, and hopefully inside that – the .jar file

image

Step 15. Manual Testing

I’m going to cover unit testing the procedure in another post, to try to limit the size of this one, but obviously now we have a .jar, we want to put that into our DB.

Right-click on the .jar and select ‘Show in Explorer’

image

Copy that .jar file and place it into the ‘plugins’ directory of your version of Neo4j. Now, if you’ve used the ‘zip’ version – the folder is already there in the root, and if you’re using the installer version – you’ll need to create a ‘plugins’ folder in the location listed in the application:

image

So copy the location and open it in explorer:

image

New folder called plugins.

Now paste the .Jar file into the plugins folder and stop (if you need to) Neo4j, then start it again to load it.

Go to the neo4j browser and login if you need to.

We’re now going to run ‘call dbms.procedures’

image

We get a list of the procs in the DB, so far so good – now it’s time to scroll on down the list…

image

Awesomeballs!

Now, let’s call that bad boy. BTW – I’m assuming you have the movies DB installed – if not, run

:play movies

now and get it all there. Done? Good.

To call our proc, we run:

call movie.getActors("Top Gun")

image

Which gets us results:

image

Now, that seems tested and working. But we probably want to start getting some unit tests in there asap, so I’ll cover that next.

So you want to go Causal Neo4j in Azure? Sure we can do that

So you might have noticed in the Azure market place you can install an HA instance of Neo4j – Awesomeballs! But what about if you want a Causal cluster?

image

Hello Manual Operation!

Let’s start with a clean slate, typically in Azure you’ve probably got a dashboard stuffed full of other things, which can be distracting, so let’s create a new dashboard:

image

Give it a natty name:

image

Save and you now have an empty dashboard. Onwards!

To create our cluster, we’re gonna need 3 (count ‘em) 3 machines, the bare minimum for a cluster. So let’s fire up one, I’m creating a new Windows Server 2016 Datacenter machine. NB. I could be using Linux, but today I’ve gone Windows, and I’ll probably have a play with docker on them in a subsequent post…I digress.

image

At the bottom of the ‘new’ window, you’ll see a ‘deployment model’ option – choose ‘Resource Manager’

image

Then press ‘Create’ and start to fill in the basics!

image

  • Name: Important to remember what it is, I’ve optimistically gone with 01, allowing me to expand all the way up to 99 before I rue the day I didn’t choose 001.
  • User name: Important to remember how to login!
  • Resource group: I’m creating a new resource group, if you have an existing one you want to use, then go for it, but this gives me a good way to ensure all my Neo4j cluster resources are in one place.

Next, we’ve got to pick our size – I’m going with DS1_V2 (catchy) as it’s pretty much the cheapest, and well – I’m all about being cheap.

image

You should choose something appropriate for your needs, obvs. On to settings… which is the bulk of our workload.

image

I’m creating a new Virtual Network (VNet) and I’ve set the CIDR to the lowest I’m allowed to on Azure (10.0.0.0/29) which gives me 8 internal IP addresses – I only need 3, so… waste.
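For the CIDR-curious, the maths behind that ‘8’ is just 2 to the power of (32 minus the prefix length). The sketch below also knocks off the 5 addresses that, as far as I understand the Azure docs, are reserved in every subnet (the first four and the last one) – treat that number as my assumption rather than gospel. SubnetMaths and its methods are made-up names.

```java
class SubnetMaths {

    // Total IPv4 addresses in a block with the given prefix length, e.g. /29 -> 8.
    static long totalAddresses(int prefixLength) {
        return 1L << (32 - prefixLength);
    }

    // Assumption: Azure keeps 5 addresses per subnet for itself (first four + last).
    static long usableInAzure(int prefixLength) {
        return totalAddresses(prefixLength) - 5;
    }

    public static void main(String[] args) {
        System.out.println("/29 total:  " + totalAddresses(29));
        System.out.println("/29 usable: " + usableInAzure(29));
    }
}
```

If that assumption holds, a /29 leaves exactly the 3 addresses a minimal cluster needs, starting at 10.0.0.4 – which would explain the IPs you’ll see later on.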

image

I’m leaving the public IP as it is, no need to change that, but I am changing the Network Security Group (NSG) as I intend on using the same one for each of my machines, and so having ‘01’ on the end (as is default) offends me :)

image

Feel free to rename your diagnostics storage stuff if you want. The choice as they say – is yours.

Once you get the ‘ticks’ you are good to go:

image

It even adds it to the dashboard… awesomeballs!

image

Whilst we wait, let’s add a couple of things to the dashboard, well, one thing, the Resource group, so view the resource groups (menu down the side) and press the ellipsis on the correct Resource group and Pin to the Dashboard:

image

So now I have:

image

After what seems like a lifetime – you’ll have a machine all setup and ready to go – well done you!

image

Now, as it takes a little while for these machines to be provisioned, I would recommend you provision another 2 now, the important bits to remember are:

  • Use the existing resource group:
    image
  • Use the same disk storage
  • Use the same virtual network
  • Use the same Network Security Group
    image

BTW, if you don’t you’re only giving yourself more work, as you’ll have to move them all to the right place eventually, may as well do it in one!

Whilst they are doing their thing, let’s setup Neo4j on the first machine, so let’s connect to it, firstly click on the VM and then the ‘connect’ button

image

We need two things on the machine

  1. Neo4j Enterprise
  2. Java

The simplest way I’ve found (provided your interwebs is up to it) is to Copy the file on your local machine, and Right-Click Paste onto the VM desktop – and yes – I’ve found it works way better using the mouse – sorry CLI-Guy

Once there, let’s install Java:

image

Then extract Neo4j to a comfy location, let’s say, the ‘C’ drive (whilst we’re here… Whaaaaat!!??? An ‘A’ drive? I haven’t seen one of those for at least 10 years, if not longer).

image

Anyways – extracted and ready to roll:

image

UH OH

image

Did you get ‘failed’ deployments on those two new VMs? I did – so I went into each one and pressed ‘Start’ and that seemed to get them back up and running.

#badtimes

(That’s right – I just hashtagged in a blog post)

Anyways, we’ve now got the 3 machines up and I’m guessing you can rinse and repeat the setting up of Java and Neo4j on the other 2 machines. Now.

To configure the cluster!

We need the internal IPs of the machines, we can run ‘IpConfig’ on each machine, or just look at the V-Net on the portal and get it all in one go:

image

So, machine number 1… open up ‘neo4j.conf’ which you’ll find in the ‘conf’ folder of Neo4j. Ugh. Notepad – seriously – it’s 2017, couldn’t there be at least a slight improvement in notepad by now???

I’m not messing with any of the other settings, purely the clustering stuff – in real life you would probably configure it a little bit more. So I’m setting:

  • dbms.mode
    • CORE
  • causal_clustering.initial_discovery_members
    • 10.0.0.4:5000,10.0.0.5:5000,10.0.0.6:5000
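The discovery members value is easy to fat-finger (it’s commas all the way, with a port on every address), so if you end up with more than a handful of machines, something like this Java sketch can spit out the lines for you. ClusterConf is a made-up name, and the IPs and port are just the ones from my VNet – swap in your own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ClusterConf {

    // Every member gets the same discovery port, and members are separated by commas.
    static String discoveryMembers(List<String> ips, int port) {
        return ips.stream()
                .map(ip -> ip + ":" + port)
                .collect(Collectors.joining(","));
    }

    // The two settings from the list above, as they'd appear in neo4j.conf.
    static List<String> confLines(List<String> ips, int port) {
        return Arrays.asList(
                "dbms.mode=CORE",
                "causal_clustering.initial_discovery_members=" + discoveryMembers(ips, port));
    }

    public static void main(String[] args) {
        List<String> ips = Arrays.asList("10.0.0.4", "10.0.0.5", "10.0.0.6");
        for (String line : confLines(ips, 5000)) {
            System.out.println(line);
        }
    }
}
```

The same two lines go into neo4j.conf on all three machines.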

I’m also uncommenting all the defaults in the ‘Causal Clustering Configuration’ section – I rarely trust defaults. I also uncomment

  • dbms.connectors.default_listen_address

So it’s contactable externally. Once the other two are setup as well we’re done right?

HA No chance! Firewalls – that’s right in plural. Each machine has one – which needs to be set to accept the ports:

5000,6000,7000,7473,7474,7687

image

Obviously, you can choose not to do the last 3 ports (7473 for HTTPS, 7474 for HTTP and 7687 for Bolt) and be uncontactable, or indeed choose any combo of them.

Aaaand, we need to configure the NSG:

image

I have 3 new ‘inbound’ rules – 7474 (browser), 7687 (bolt), 7000 – Raft.

Right. Let’s get this cluster up and contactable.

Log on to one of your VMs and fire up PowerShell (in admin mode)

image

First we navigate to the place we installed Neo4j (in my case c:\neo4j\neo4j-enterprise-3.1.3\bin) and then we import the Neo4j-Management module. To do this you need to have your ExecutionPolicy set appropriately. Being Lazy, I have it set to ‘Bypass’ (Set-ExecutionPolicy bypass).

Next we fire up the server in ‘console’ mode – this allows us to see what’s happening, for real situations – you’re going to install it as a service.

You’ll see the below initially:

image

and it will sit like that until the other servers are booted up. So I’ll leave you to go do that now…

Done?

Good – now, we need to wait a little while for them to negotiate amongst themselves, but after a short while (let’s say 30 secs or less) you should see:

image

Congratulations! You have a cluster!

Logon to that machine via the IP it says, and you’ll see the Neo4j Browser, login and then run

:play sysinfo

image

You could now run something like:

Create (:User {Name:'Your Name'})

And then browse to the other machines to see it all nicely replicated.