Testing Neo4j.Driver (4.1.1) Part 1

There are a few challenges when it comes to testing code that uses the official Neo4j.Driver. Over time I’ve hit a few of them, and thought it would be good to share them with you.

TL;DR; This will all be available on GitHub

So, let’s write a method, in which we pass a title in, and get the Movie that the title relates to. A few assumptions:

  1. Movie exists as a class, with a Title (string), Tagline (string) and Released (int) property
  2. The method will be in a MovieStore class, which we’re not testing (we assume it’s already been tested)
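For reference, here is a minimal sketch of the Movie class those assumptions describe. Only the three property names and their types come from the list above; the auto-property shape is my assumption:

```csharp
// A minimal sketch of the Movie class assumed by the tests in this post.
public class Movie
{
    // Matches the 'title' property on the Movie node.
    public string Title { get; set; }

    // Matches the 'tagline' property on the Movie node.
    public string Tagline { get; set; }

    // Matches the 'released' property (the release year) on the Movie node.
    public int Released { get; set; }
}
```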

NuGet wise – we’re going to be using Neo4j.Driver (4.1.1), xUnit, Moq and FluentAssertions.

I’m not going to go full TDD – I think we can skip a bit ahead, and start with a base stub that we can test:

public class MovieStore
{
    private readonly IDriver _driver;

    public MovieStore(IDriver driver)
    {
        _driver = driver;
    }

    public async Task<Movie> GetMovie(string title)
    {
        return null;
    }
}

Test 1

If I call with an invalid title, I get null back

[Fact]
public async Task Test1_ReturnsNull_WhenInvalidTitleGiven()
{
    var movieStore = new MovieStore( /* IDriver?? */ );
    var actual = await movieStore.GetMovie("invalid");

    actual.Should().BeNull();
}

Our first problem: we need to mock an IDriver instance for the MovieStore to be constructed. This is why we have Moq:

[Fact]
public async Task Test1_ReturnsNull_WhenInvalidTitleGiven()
{
    var driverMock = new Mock<IDriver>();
    var movieStore = new MovieStore( driverMock.Object );
    var actual = await movieStore.GetMovie("invalid");

    actual.Should().BeNull();
}

Running our test – and it passes, for the obvious reason that – well, we only return null.

Test 2

Jumping ahead – we might go with ‘passing in a valid title gets a Movie’. But code-wise to get to this stage there’s actually quite a lot we need to consider.

  1. We need a Session
  2. With that Session we need to run a Transaction Function (we’ll come to why a bit later on)
  3. In that function, we need to execute some Cypher
  4. We need to parse the results of that Cypher
  5. Each result from above we’ll need to parse into a Movie (in this case there really should only be 1 result)
  6. We need to return that Movie out of the function

So, I guess we ought to test that we get a Session to work with:

[Fact]
public async Task Test2_UsesTheAsyncSession_ToGetTheMovie()
{
    var driverMock = new Mock<IDriver>();
    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("valid");

    driverMock.Verify(x => x.AsyncSession(), Times.Once);
}

A failing test!
To fix it – let’s add our code into our method:

public async Task<Movie> GetMovie(string title)
{
    _driver.AsyncSession();
    return null;
}

As a ‘be the best person’ thing – we should be disposing of that Session as well, which is where we meet our next Mock target. Closing the Session is a method of the IAsyncSession object, not the IDriver. So we need our Mock<IDriver> to supply an IAsyncSession:

[Fact]
public async Task Test2a_ClosesTheAsyncSession_ToGetTheMovie()
{
    //Our new mock
    var sessionMock = new Mock<IAsyncSession>();

    var driverMock = new Mock<IDriver>();

    //We setup the driverMock to Return the sessionMock object when anything asks for the AsyncSession
    driverMock
        .Setup(x => x.AsyncSession())
        .Returns(sessionMock.Object);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("valid");

    //Now we can check the session is closed.
    sessionMock.Verify(x => x.CloseAsync(), Times.Once);
}

Our new code looks like:

public async Task<Movie> GetMovie(string title)
{
    var session = _driver.AsyncSession();
    await session.CloseAsync();
    return null;
}

Good – now run our tests, OH NOES!! the previous 2 tests have just broken, throwing NullReferenceExceptions due to the fact that the AsyncSession isn’t mocked for them.

I would heartily recommend NCrunch or ReSharper to automatically run your tests as soon as possible, so you get feedback directly while writing them.


Decision time!

There are a few options here. The first is to ignore the error, and for Test2_UsesTheAsyncSession_ToGetTheMovie() this is an option. With this test we could catch the exception, which arguably is the right thing to do, as we don’t actually care whether the call succeeds, only that we attempt to open the session.

The second option is to copy the setup code to the other 2 methods. For Test1 this is the right choice: the method has to run to completion for us to check that it returns null, so it needs a session it can close.

I’m no purist, and I don’t particularly want my tests covered with the same boiler plate code, so I’m going to extract my setup code into another method, and use that in place:

//We're returning 'Mock' versions so we can do verification later if we want to
private static void GetMocks(out Mock<IDriver> driver, out Mock<IAsyncSession> session)
{
    var sessionMock = new Mock<IAsyncSession>();

    var driverMock = new Mock<IDriver>();
    driverMock
        .Setup(x => x.AsyncSession())
        .Returns(sessionMock.Object);

    driver = driverMock;
    session = sessionMock;
}

I imagine this method will change over time, adding more mocks, maybe some more default options, etc. To use it in my Test2a test we do:

[Fact]
public async Task Test2a_ClosesTheAsyncSession_ToGetTheMovie()
{
    //Getting the mocks here.
    GetMocks(out var driverMock, out var sessionMock);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("valid");

    sessionMock.Verify(x => x.CloseAsync(), Times.Once);
}

For the other tests (and I’ll just show Test1 here), I can use the _ discard to ignore the out parameters I’m not interested in:

[Fact]
public async Task Test1_ReturnsNull_WhenInvalidTitleGiven()
{
    //See the use of _ here for the session mock, as I don't need it
    GetMocks(out var driverMock, out _);
    var movieStore = new MovieStore(driverMock.Object);
    var actual = await movieStore.GetMovie("invalid");

    actual.Should().BeNull();
}

Test 3

Way up above – step 2 was:

2. With that Session we need to run a Transaction Function (we’ll come to why a bit later on)

The simplest way for us to run the Cypher we want is just calling RunAsync on the Session. The problem with this approach comes when we decide to move to a Cluster: there, it is recommended we use Transaction Functions, as they automatically retry the work for us when it fails for transient reasons (for example, a cluster member going away). Given the way Causal Clusters work, that takes the effort away from us – the developers – which can only be a good thing!

So, the test:

[Fact]
public async Task Test3_OpensAReadTransaction()
{
    GetMocks(out var driverMock, out var sessionMock);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("valid");

    sessionMock.Verify(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task>>()), Times.Once);
}

Hmm what’s this It.IsAny stuff that’s suddenly appeared? Well, we only care that we use a ReadTransaction – at this stage we don’t care what is actually going on in the transaction function, merely that we’re using one.

Our method now looks like:

public async Task<Movie> GetMovie(string title)
{
    var session = _driver.AsyncSession();
    //Return a completed Task rather than null - the real driver awaits whatever we return here
    await session.ReadTransactionAsync(tx => Task.CompletedTask);

    await session.CloseAsync();
    return null;
}

Test 4

It’s executing cypher time!

So, quick recap. We have our code opening a Session, and in that Session we’re opening a ReadTransaction function. All good – now, we need to do something in our transaction.

Let’s first work out our Cypher. We want a Movie by title so:

MATCH (m:Movie) WHERE m.title = $title RETURN m

I’m using a parameter $title for the title, as that lets Neo4j cache the query plan and reuse it for subsequent queries, and it avoids Cypher injection too.
So, we want to call that in our transaction, how do we do that?

public async Task<Movie> GetMovie(string title)
{
    //I usually put this at the top, to make it obvious.
    //For big queries - I'll use '@' to multiline it to make it more readable.
    const string query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";

    var session = _driver.AsyncSession();
    //RunAsync is async, so we pass in the 'tx' as 'async'
    await session.ReadTransactionAsync(async tx =>
    {
        //We await a call to 'RunAsync'
        await tx.RunAsync(query, new {title});
    });

    await session.CloseAsync();
    return null;
}

At this point, all I’m doing is executing some Cypher, I’m not dealing with the results, all I want to check is that the cypher is correct.

This is a contrived example, given it’s a const at the top of the method, would I really bother testing the Cypher is correct? Maybe not, but for a query which is composed, I might want to. It’s also useful to be able to test that the title is being passed in as a parameter rather than a string. Which is the kind of thing an overly optimistic developer can do by accident. Also – this is a way to show how you would test this.

In a break from tradition, I’ve shown you the code before the test, and that’s because it’s easier to show how to test it if you can see the code.

We’ll need to Mock the tx part of this, to be able to check what is called. tx in this case is an IAsyncTransaction – so let’s Mock that:

var transactionMock = new Mock<IAsyncTransaction>();

Now the slightly tricky bit – we need to get this mock into the actual call that is being made. This is so we can verify what is called.

We have a mock Session – so let’s Setup the call:

sessionMock
    .Setup(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task>>()))
    .Returns((Func<IAsyncTransaction, Task> func) =>
    {
        func(transactionMock.Object);
        return Task.CompletedTask;
    });

Let’s take it apart a bit. First – we .Setup the call – we don’t care what the actual call is, only that it matches the pattern (Func<IAsyncTransaction, Task>), and at the moment it does.

Next we .Returns a Task.CompletedTask – this maps to the Task part of the Func, but we also have this (Func<IAsyncTransaction, Task> func) => bit – which is the most important bit. Basically we’re taking the parameter from the Setup and injecting our own IAsyncTransaction: func(transactionMock.Object);. Note that the Task returned by func is discarded here, so an exception thrown inside an async transaction function won’t surface in the test.

Because we use our Mock – we can now Verify that the code is called correctly:

[Fact]
public async Task Test4_ExecutesTheRightCypher()
{
    const string expectedCypher = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
    GetMocks(out var driverMock, out _, out var transactionMock);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("valid");

    transactionMock.Verify(x => x.RunAsync(expectedCypher, It.IsAny<object>()), Times.Once);
}
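The test above calls GetMocks with a third out parameter. The post doesn't show that version of the helper, so here is a sketch of what it might look like, built from the Setup shown earlier. The MockSetup class name is hypothetical (in the test project the method just lives in the test class):

```csharp
using System;
using System.Threading.Tasks;
using Moq;
using Neo4j.Driver;

// Hypothetical holder class, only here so the sketch compiles on its own.
public static class MockSetup
{
    public static void GetMocks(
        out Mock<IDriver> driver,
        out Mock<IAsyncSession> session,
        out Mock<IAsyncTransaction> transaction)
    {
        // Locals are needed because out parameters can't be captured in a lambda.
        var transactionMock = new Mock<IAsyncTransaction>();

        var sessionMock = new Mock<IAsyncSession>();
        sessionMock
            .Setup(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task>>()))
            .Returns((Func<IAsyncTransaction, Task> func) =>
            {
                // Inject our mock transaction into the transaction function.
                // The Task that func returns is discarded, as in the Setup above.
                func(transactionMock.Object);
                return Task.CompletedTask;
            });

        var driverMock = new Mock<IDriver>();
        driverMock
            .Setup(x => x.AsyncSession())
            .Returns(sessionMock.Object);

        driver = driverMock;
        session = sessionMock;
        transaction = transactionMock;
    }
}
```

Tests that don't need the transaction mock can discard it with out _, the same way Test1 discards the session mock.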

We’ve modified our GetMocks method to return the transactionMock as an out parameter. We’re also not testing the parameter itself. Merely that the Cypher is correct. So. Let’s test that parameter:

[Fact]
public async Task Test4a_ExecutesUsingTheRightParameter()
{
    const string expectedParameter = "valid";
    GetMocks(out var driverMock, out _, out var transactionMock);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie(expectedParameter);

    transactionMock.Verify(x => x.RunAsync(It.IsAny<string>(), expectedParameter), Times.Once);
}

Sorted. Oh wait. Hmmm. Doesn’t work – Ahhh, RunAsync takes an object (or the overload we’re using does), and we actually pass it in as new { title } – Anonymous type time!

We need to do some custom comparison here, for that we’ll use It.Is:

transactionMock.Verify(x => x.RunAsync(It.IsAny<string>(), It.Is<object>(o => <COMPAREHERE>)), Times.Once);

The tricky bit here is the <COMPAREHERE> bit. What we have is an anonymous type, so we can’t just do: o == expectedParameter, we need to get the value out. My first attempt was to try:

((dynamic)o).title == expectedParameter

But that didn’t work as expression trees can’t contain dynamic operations – so I learnt something 🙂 Generally, when I’ve got to this point, it’s a method we’re after. We need 3 parameters: the first is the actual object o itself. Next, we need the expectedValue, and because we’re pulling from an anonymous type we’ll need to know the name of the property we’re looking for. This last is important, as we could pass in more than one parameter (new {title, title1 = title, title2 = title}).

Luckily, in another project I have one of these for just such an occasion – who’d have thought?!

private static bool CompareParameters<T>(object o, T expectedValue, string propertyName) 
{
    var actualValue = (T) o.GetType().GetProperty(propertyName)?.GetValue(o);
    actualValue.Should().Be(expectedValue);
    return true;
}

We use reflection to get the value from the o and then cast (T) it to T – then call actualValue.Should().Be(expectedValue) – which will throw an exception if it’s not. Finally, we return true; and we do this as It.Is needs a bool response, and if no exceptions have been thrown, then we’re all good.

Putting that into the codebase, our call now looks like:

[Fact]
public async Task Test4a_ExecutesUsingTheRightParameter()
{
    const string expectedParameter = "valid";
    GetMocks(out var driverMock, out _, out var transactionMock);

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie(expectedParameter);

    transactionMock.Verify(x => x.RunAsync(It.IsAny<string>(), It.Is<object>(o => CompareParameters(o, expectedParameter, "title"))), Times.Once);
}
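One variation on CompareParameters, if you'd rather have Moq report a failed verification than have an assertion exception thrown from inside the matcher: return false on a mismatch. This is only a sketch. The ParameterComparer class and HasParameter method are names I've made up for illustration, and it uses nothing beyond reflection from the base class library:

```csharp
using System;

// Hypothetical helper, a non-throwing take on CompareParameters.
public static class ParameterComparer
{
    public static bool HasParameter<T>(object o, T expectedValue, string propertyName)
    {
        if (o == null) return false;

        // Anonymous types expose their members as public read-only properties,
        // so plain reflection can find them by name.
        var property = o.GetType().GetProperty(propertyName);

        // A missing property is treated as 'no match' rather than an error.
        if (property == null) return false;

        // object.Equals copes with nulls and with boxed value types.
        return object.Equals(property.GetValue(o), expectedValue);
    }
}
```

It would slot into the Verify call in the same place: It.Is<object>(o => ParameterComparer.HasParameter(o, expectedParameter, "title")). The trade-off is that the failure message comes from Moq and won't tell you which value was actually passed.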

Test 5

OK, where have we got to? Step 4, parsing our results. This makes sense, at the moment, all we’re doing is running Cypher.

So the trick here is to Mock the result from the query we pass in. We do this rather than connect to a Neo4j instance for a couple of reasons. The first is that I don’t want to test that Neo4j works; I think at this point, it’s safe to assume that executing a query will work. I don’t know that the query is right (that would require integration testing), but at this stage I want to know that, assuming the connection is ok, the query has run and we get back valid data, I can parse that data. The other reason is that I don’t have to write a load of setup code, and probably some PowerShell scripts to bring down an instance of Neo4j, start it, etc. At least not in this post!

So what does the data from Neo4j look like?

When I’m in this sort of situation, particularly if I’m investigating what the outcome of a query will look like, I go to my trusted friend, LINQPad, and run the following:

var driver = GraphDatabase.Driver("neo4j://localhost:7687", AuthTokens.Basic("neo4j","neo"));
var session = driver.AsyncSession();

var cursor = await session.RunAsync("MATCH (m:Movie) WHERE m.title = 'The Matrix' RETURN m");
await cursor.FetchAsync();
cursor.Dump();

Gets me:

So we get an IResultCursor with a property called Current, and inside that an IRecord, which has Keys and Values properties.

The Keys property contains my m that I’m returning from the query (RETURN m), and I can see that it is a Node – which for us, is an INode.

The likelihood is that the INode contains the actual data we want, so if we add:

cursor.Current["m"].As<INode>().Dump();

We get:

Bingo! All the datas. So to test this, we need to Mock the following:

  • IResultCursor – with setups for the Current["m"] property indexer, and the FetchAsync method – as we need to call that to succeed – so we ought to ensure we do call it.
  • INode – with the Properties property.

Why are we not Mocking the IRecord? Well – this is because we can bypass it and just return the INode from the Current["m"] call. If we were getting the IRecord from the Current property directly, and then accessing it, we would need to Mock it.

Now, we’ve reached an interesting point. Would it be better for us to make a Stub instead of a Mock for any of these?

We could implement a TestNode : INode pretty easily, but we’d end up implementing a lot more than we need to test our code base.

I would argue that for this we should approach it from a Mock point of view, and if at some point it makes sense to Stub, then go for that at that point. For now… Mocking…

So, the IResultCursor comes from the IAsyncTransaction, and we already have that mocked, so we can just access it and add a new Setup:

var cursorMock = new Mock<IResultCursor>();
transactionMock
    .Setup(x => x.RunAsync(It.IsAny<string>(), It.IsAny<object>()))
    .Returns(Task.FromResult(cursorMock.Object));

We don’t need to setup anything on cursorMock at the moment, as the default for a Moq mock is Loose which means it won’t throw an exception when a call is made on a method that isn’t setup.

[Fact]
public async Task Test5_CallsFetchAsyncToGetTheNextRecord()
{
    GetMocks(out var driverMock, out _, out var transactionMock);

    var cursorMock = new Mock<IResultCursor>();
    transactionMock
        .Setup(x => x.RunAsync(It.IsAny<string>(), It.IsAny<object>()))
        .Returns(Task.FromResult(cursorMock.Object));

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("Valid");

    cursorMock.Verify(x => x.FetchAsync(), Times.Once);
}

Which leads to our method being:

public async Task<Movie> GetMovie(string title)
{
    const string query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";

    var session = _driver.AsyncSession();
    await session.ReadTransactionAsync(async tx =>
    {
        var cursor = await tx.RunAsync(query, new {title});
        await cursor.FetchAsync();
    });

    await session.CloseAsync();
    return null;
}

So, next step, let’s make the IResultCursor return valid data, beginning with an INode:

const string expectedTitle = "Title";
const string expectedTagline = "Tagline";
const int expectedReleaseDate = 2000;

var nodeMock = new Mock<INode>();
nodeMock.Setup(x => x.Properties["title"]).Returns(expectedTitle);
nodeMock.Setup(x => x.Properties["tagline"]).Returns(expectedTagline);
nodeMock.Setup(x => x.Properties["released"]).Returns(expectedReleaseDate);

I suspect we’ll be extracting that out into a method, in fact, let’s just do that:

private static Mock<INode> GetMockNode(string title = "Title", string tagline = "Tagline", int? released = 2000)
{
    var nodeMock = new Mock<INode>();
    nodeMock.Setup(x => x.Properties["title"]).Returns(title);
    nodeMock.Setup(x => x.Properties["tagline"]).Returns(tagline);
    nodeMock.Setup(x => x.Properties["released"]).Returns(released);
    return nodeMock;
}

I’ve also removed the consts as in this case, I’m not actually checking that it’s doing it right, merely that it is attempting to get the values.

We need to return our node mock now:

var nodeMock = GetMockNode();
var cursorMock = new Mock<IResultCursor>();
cursorMock.Setup(x => x.Current["m"]).Returns(nodeMock.Object);

OK, and lastly, verify that the node is called on:

nodeMock.Verify(x => x.Properties["title"], Times.Once);
nodeMock.Verify(x => x.Properties["tagline"], Times.Once);
nodeMock.Verify(x => x.Properties["released"], Times.Once);

Now, the above code is the reason I have mocked each indexer as opposed to just making Properties return a new Dictionary<string, object> initialized with the values. Using a Dictionary will work but it restricts our verifiability to only checking if the Properties property was hit, not what specifically was hit. So all I could do was:

nodeMock.Verify(x => x.Properties, Times.Exactly(3));

But, calling node.Properties["title"] 3 times, would still pass, and that’s not right. That’s wrong.

Test 6

  5. Each result from above we’ll need to parse into a Movie (in this case there really should only be 1 result)

Hmmm, 2 things here, first, convert to a Movie – I actually did that in the last test (tbh – I know how this is going to come out and I’m cheating a bit as I suspect this is already quite long!). The other thing is the words ‘Each result’.

Each result

There could be more than 1.

This is bad news. Our method only returns Movie not IEnumerable<Movie> – and we’re certainly not parsing more results. Now we have two choices. Change the method to return IEnumerable, or leave it, and assume our data is clean and normalized, and there can be only one movie with a title (given that the movie industry appears to work primarily with remakes at the moment, this seems unlikely.)

We’re going to change the signature, only so we can show how to mock the multiple items scenario. Because, 90% of the time, you’ll probably end up doing that, and well, we may as well cover it. If you’re not doing that – then – you can still learn.

So. Signature change. Task<Movie> –> Task<IEnumerable<Movie>> – which due to the way we’ve currently written tests (i.e. not checking results etc) means we’re actually all good with the change.

So, let’s setup our mock of the IResultCursor to return 1 movie with the first FetchAsync and false for the second FetchAsync.

[Fact]
public async Task Test6_CallsFetchAsyncUntilFalseReturned()
{
    GetMocks(out var driverMock, out _, out var transactionMock);

    var nodeMock = GetMockNode();
    var cursorMock = new Mock<IResultCursor>();
    cursorMock.Setup(x => x.Current["m"])
        .Returns(nodeMock.Object);

    cursorMock
        .SetupSequence(x => x.FetchAsync())
        .Returns(Task.FromResult(true))
        .Returns(Task.FromResult(false));

    transactionMock
        .Setup(x => x.RunAsync(It.IsAny<string>(), It.IsAny<object>()))
        .Returns(Task.FromResult(cursorMock.Object));

    var movieStore = new MovieStore(driverMock.Object);
    await movieStore.GetMovie("Valid");

    cursorMock.Verify(x => x.FetchAsync(), Times.Exactly(2));
}

Moq has a SetupSequence method allowing us to configure the responses, in this case true then false. We then check we call FetchAsync() twice.

Now. When we change the code to reflect this, we’re probably going to break one of our other tests as we don’t have the FetchAsync setup to return anything.

OK, it was only Test5a_AttemptsToGetTheData() we needed to fix, due to the fact that the default response from FetchAsync will be false so – we never go into our loop to get data:

var cursor = await tx.RunAsync(query, new {title});
//Assign the result to a variable
var fetched = await cursor.FetchAsync();

//While that's 'true'
while (fetched)
{
    /* All the node reading code here */

    //Then see if we have another one to get
    fetched = await cursor.FetchAsync();
}

Right, finally, we’re going to get the actual Movie instances…

Test 7

Eh?! What about the rest of the previous step? As we need to return the Movie to be able to test it, that fits into our:

  6. We need to return that Movie out of the function

And this is in no way because this post has gotten longer than I imagined.

Unfortunately, the changes we need to make here will break a lot of our tests. So far, we’ve been returning nothing from our ReadTransactionAsync method, which means the signature has been: Func<IAsyncTransaction, Task>, but now we’re changing it to Func<IAsyncTransaction, Task<IEnumerable<Movie>>> and that means our Mocks for the IAsyncSession need to be updated.

Fortunately, as we’d extracted that mock setup to one method, we can change it there and fix all the broken tests in one go.

Or so I thought

Uh oh! The mock setup did this:

sessionMock
    .Setup(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task>>()))
    .Returns((Func<IAsyncTransaction, Task> func) =>
    {
        func(transactionMock.Object);
        return Task.CompletedTask;
    });

To change it to actually return the Func we call, we need to go with:

sessionMock
    .Setup(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task<List<Movie>>>>()))
    .Returns((Func<IAsyncTransaction, Task<List<Movie>>> func) =>
    {
        return func(transactionMock.Object);
    });

At which point we discover, that the first tests start to fail as they call on things which are not there (IResultCursor) – so we should add those into our default mock setups.

We add:

var cursorMock = new Mock<IResultCursor>();
transactionMock
    .Setup(x => x.RunAsync(It.IsAny<string>(), It.IsAny<object>()))
    .Returns(Task.FromResult(cursorMock.Object));

to the GetMocks method – this allows us to get most of the tests working, the other broken ones are where (as in Test3_OpensAReadTransaction) we’re testing that sessionMock.Verify(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task>>()), Times.Once); is called, but this should be Func<IAsyncTransaction, Task<List<Movie>>>.

Why List<Movie> and not IEnumerable<Movie>? Well, because we put our results into a List<Movie> and return that, and yes, I could call AsEnumerable but, meh, I can live with a proper List.
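Pulling those changes together, GetMocks might end up looking something like the sketch below. Test7 calls it with four out parameters; I'm assuming the fourth is the cursor mock, since the post doesn't show the signature. The FinalMockSetup class name and the stand-in Movie class are only there so the sketch compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Moq;
using Neo4j.Driver;

// Minimal stand-in for the Movie class assumed at the start of the post.
public class Movie
{
    public string Title { get; set; }
    public string Tagline { get; set; }
    public int Released { get; set; }
}

// Hypothetical holder class for the helper.
public static class FinalMockSetup
{
    public static void GetMocks(
        out Mock<IDriver> driver,
        out Mock<IAsyncSession> session,
        out Mock<IAsyncTransaction> transaction,
        out Mock<IResultCursor> cursor) // assumption: the fourth out is the cursor
    {
        // Default cursor: being a loose mock, FetchAsync() yields false until a test sets it up.
        var cursorMock = new Mock<IResultCursor>();

        var transactionMock = new Mock<IAsyncTransaction>();
        transactionMock
            .Setup(x => x.RunAsync(It.IsAny<string>(), It.IsAny<object>()))
            .Returns(Task.FromResult(cursorMock.Object));

        var sessionMock = new Mock<IAsyncSession>();
        sessionMock
            .Setup(x => x.ReadTransactionAsync(It.IsAny<Func<IAsyncTransaction, Task<List<Movie>>>>()))
            // Hand back whatever the transaction function returns.
            .Returns((Func<IAsyncTransaction, Task<List<Movie>>> func) => func(transactionMock.Object));

        var driverMock = new Mock<IDriver>();
        driverMock
            .Setup(x => x.AsyncSession())
            .Returns(sessionMock.Object);

        driver = driverMock;
        session = sessionMock;
        transaction = transactionMock;
        cursor = cursorMock;
    }
}
```

A test that needs specific data can still override the RunAsync setup on the transaction mock with its own cursor, as Test6 does.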

[Fact]
public async Task Test7_ReturnsTheMovie()
{
    GetMocks(out var driverMock, out _, out var transactionMock, out _);

    const string expectedTitle = "Foo";
    const string expectedTagline = "Bar";
    const int expectedReleased = 1900;
    var nodeMock = GetMockNode(expectedTitle, expectedTagline, expectedReleased);

    /* Same setup as for Test6 - removed to make the post a bit smaller */

    var movies = (await movieStore.GetMovie("Valid")).ToList();

    movies.Should().HaveCount(1);
    var movie = movies.First();
    movie.Title.Should().Be(expectedTitle);
    movie.Tagline.Should().Be(expectedTagline);
    movie.Released.Should().Be(expectedReleased);
}

First off, we use the same basic setup as for Test6_CallsFetchAsyncUntilFalseReturned, and this is because we’re doing roughly the same thing.

I’m using the consts to allow me to confirm the values are what I’m saying they should be.

Here we come to a slightly odd question – if the answers I got back were wrong, say, for example, Title came back as "Title" – is it the code I’m testing that is wrong, or is it the Mock? Both places could give the error.

Who tests the tests?

Anyhews, our Test7 above will fail at the moment as we don’t return the actual values. We’ve changed our ReadTransactionAsync code to be like:

await session.ReadTransactionAsync(async tx =>
{
    var cursor = await tx.RunAsync(query, new {title});
    var fetched = await cursor.FetchAsync();

    //What we're outputting
    var output = new List<Movie>();
    while (fetched)
    {
        /* Movie extraction code */

        //Add it to the output
        output.Add(movie);
        fetched = await cursor.FetchAsync();
    }

    //Return that output!
    return output;
});

But the containing method doesn’t return it:

public async Task<IEnumerable<Movie>> GetMovie(string title)
{
    const string query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";

    var session = _driver.AsyncSession();
    await session.ReadTransactionAsync(async tx =>
    {
        /* ReadTransactionCode */
    });

    await session.CloseAsync();
    return null;
}

All we need to do is capture the output, and return it:

public async Task<IEnumerable<Movie>> GetMovie(string title)
{
    const string query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";

    var session = _driver.AsyncSession();

    //Capture the output
    var results = await session.ReadTransactionAsync(async tx =>
    {
        /* ReadTransactionCode */
    });

    await session.CloseAsync();

    //Return the output
    return results;
}

Changing the code to do this, fixes that test, but breaks our very first one – that we return null when there are no movies that match the title.

Presently – it returns an Empty collection.

Is this correct? Should it be null or Empty? As we’re now returning IEnumerable – I prefer to return an Empty collection – this is because I might well end up doing something like:

foreach(var movie in await store.GetMovie("title")) { /* CODE */ }

I don’t want to have to check for null.

So, let’s change that test from:

[Fact]
public async Task Test1_ReturnsNull_WhenInvalidTitleGiven()
{
    GetMocks(out var driverMock, out _, out _, out _);
    var movieStore = new MovieStore(driverMock.Object);
    var actual = await movieStore.GetMovie("invalid");

    actual.Should().BeNull();
}

to:

[Fact]
public async Task Test1_ReturnsEmptyCollection_WhenInvalidTitleGiven()
{
    GetMocks(out var driverMock, out _, out _, out _);
    var movieStore = new MovieStore(driverMock.Object);
    var actual = await movieStore.GetMovie("invalid");

    actual.Should().BeEmpty();
}

Now all our tests pass.


I’m going to stop here for the moment as I think it’s probably long enough, and we cover quite a lot of the Driver.

I’ll do another (shorter) post going into more detail about the IAsyncSession mocking techniques, as there are some complications with the SessionConfigBuilder that we need to address.

Depending on when you read this – there may already be the code in the repo. 🙂

Using Neo4j.Driver? Now you can EXTEND it!

Some Code

Hot on the heels of Neo4jClient 4.0.0, I was doing some work with the Neo4j.Driver (the official client for Neo4j), and in doing so, I realised I was writing a lot of boiler plate code.

So I started adding extension methods to help me, and as my extension methods became more involved, I moved them to another project, and then… well… decided to release them!

TL;DR; you can get the Neo4j.Driver.Extensions package on Nuget, and the GitHub page is here.


The Problem

Let’s first look at the problem. Neo4j.Driver is quite verbose, and you end up having lots of ‘magic strings’ throughout the codebase, which can lead to problems at runtime. One of the reasons we’re using .NET is to try to avoid runtime errors when we can get compilation errors instead.

Let’s take a look at a ‘standard’ read query. Here we’re executing the following Cypher to get a movie with a given title.

MATCH (m:Movie)
WHERE m.title = $title
RETURN m

We MATCH a Movie based on its title and return it, easy. Code wise we have this:

public async Task<Movie> GetMovieByTitle(string title)
{
    var session = _driver.AsyncSession();
    var results = await session.ReadTransactionAsync(async tx =>
    {
        var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
        var cursor = await tx.RunAsync(query, new {title});
        var fetched = await cursor.FetchAsync();

        while (fetched)
        {
            var node = cursor.Current["m"].As<INode>();
            var movie = new Movie
            {
                Title = node.Properties["title"].As<string>(),
                Tagline = node.Properties["tagline"].As<string>(),
                Released = node.Properties["released"].As<int?>()
            };
            return movie;
        }

        return null;
    });

    return results;
}

We’re using transactional functions to give us future proofing should we decide to connect to a Cluster, as the function will retry the query if the current cluster member we’re connected to takes the unfortunate decision to ‘move on’.

Let’s take a closer look at the creation of the Movie object:

var node = cursor.Current["m"].As<INode>();
var movie = new Movie
{
    Title = node.Properties["title"].As<string>(),
    Tagline = node.Properties["tagline"].As<string>(),
    Released = node.Properties["released"].As<int?>()
};
return movie;

In our first line, we pull the Current IRecord from the IResultCursor, by an identifier, and attempt to get it as an INode. This is ok, I know in my query that’s what I’ve asked for (RETURN m). Then I proceed to go through the properties in my Movie – assigning the properties from the node into the right place.

For the string properties (title and tagline) it’s just an As<string>() call, which works fine as long as the property is actually on the node. Properties is a dictionary, so indexing a key that isn’t there throws a KeyNotFoundException rather than handing back null. The released property is more complex. To keep the code simple, I’ve used .As<int?>() (nullable int), as I happen to know the default movies database does have some nodes without the released property – we’re schema-free here.

What I could have done would be:

var released = node.Properties["released"].As<int?>();
if(released.HasValue)
    Released = released.Value;
else { /* ?!?!? */ }

Which would make my Movie class slightly tighter – but I guess, if my data can have nodes without the property – then my models should too… 🙂

GetValue

Aaaaanyways. The first step in the extension methods was to create a ‘GetValue’ method:

var node = cursor.Current["m"].As<INode>();
var movie = new Movie
{
    Title = node.GetValue<string>("title"),
    Tagline = node.GetValue<string>("tagline"),
    Released = node.GetValue<int?>("released")
};
return movie;

This method will return default if the Properties property doesn’t contain the given key; otherwise it calls As<T>, so the associated exceptions (FormatException and InvalidCastException) can still be thrown.
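To make that behaviour concrete, here is a minimal sketch of the same idea. It is not the library’s code: GetValueSketch is a hypothetical name, it is written against IReadOnlyDictionary<string, object> (the type of INode.Properties) so that it compiles without the driver, and it uses Convert.ChangeType as a stand-in for the driver’s As<T>.

```csharp
using System;
using System.Collections.Generic;

public static class PropertySketch
{
    // Hypothetical helper, NOT the library's GetValue implementation.
    // Returns default(T) when the key is missing or the stored value is null,
    // otherwise converts the value to T (conversion errors still surface).
    public static T GetValueSketch<T>(this IReadOnlyDictionary<string, object> properties, string key)
    {
        if (!properties.TryGetValue(key, out var value) || value is null)
            return default(T);

        // For int? we convert to the underlying int; the cast re-wraps it as nullable.
        var targetType = Nullable.GetUnderlyingType(typeof(T)) ?? typeof(T);
        return (T)Convert.ChangeType(value, targetType);
    }
}
```

With the driver referenced, node.Properties.GetValueSketch<int?>("released") would be the equivalent of the GetValue call above.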

I mean. If that was it, you’d be right to say that this was a waste of time. But, it does simplify the call a bit… onwards!

GetContent (NetStandard 2.1 only)

I’d like it to be a bit simpler, so let’s try to sort out the while loop. Remember we have this:

var cursor = await tx.RunAsync(query, new {title});

// Fetch in the loop condition, otherwise the loop never advances past the first record
while (await cursor.FetchAsync())
{
    var node = cursor.Current["m"].As<INode>();
    /* object creation code here */
}

We FetchAsync from the cursor, then while that returns true we parse our Current into an INode and then create our object.

Instead of the Fetch/While loop, we can do this instead:

var cursor = await tx.RunAsync(query, new {title});
await foreach(var node in cursor.GetContent<INode>("m")) 
{
    /* object creation code here */
}

This removes the cursor.Current["m"].As<INode>() line, and simplifies the while into a pleasing foreach.

NB. This is NetStandard 2.1 as it uses the IAsyncEnumerable interface which isn’t in NetStandard 2.0
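The shape of such a method is easy to sketch. The following is an assumption about how it could be written rather than the library’s actual code: ReadAll is a hypothetical helper that takes the ‘fetch’ and ‘current’ halves of a cursor as delegates (so it compiles without the driver) and turns them into an IAsyncEnumerable<T> using a C# 8 async iterator.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class CursorSketch
{
    // Hypothetical helper: keeps calling fetchAsync until it reports no more rows,
    // yielding whatever current() returns for each row that was fetched.
    public static async IAsyncEnumerable<T> ReadAll<T>(Func<Task<bool>> fetchAsync, Func<T> current)
    {
        while (await fetchAsync())
            yield return current();
    }
}
```

Against a real cursor the call would look something like CursorSketch.ReadAll(cursor.FetchAsync, () => cursor.Current["m"].As<INode>()), which is roughly the typing that GetContent<INode>("m") saves you.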

ToObject

OK, so we’re starting to look a bit better, some of the boiler plate is going, is there anything else we can do?

OF COURSE!

Instead of all the object creation, requiring you to go through the properties (and what if you add one later in development and forget to update this code!?!) – we can use ToObject

This works on an INode (and we’ll see later other things), and allows you to pass in a Type as a generic parameter, which will be parsed, and returned to you filled if possible:

var cursor = await tx.RunAsync(query, new {title});

await foreach (var node in cursor.GetContent<INode>("m")) 
    return node.ToObject<Movie>();

Neo4jProperty

We should probably pause here to talk about Neo4jPropertyAttribute. So far, we’ve had the properties from Neo4j all lowercase, but the observant among you will have noticed that the Movie class uses upper camel case naming, as .NET typically does.

When we’re doing the GetValue approach – not such an issue – as we define the identifier ourselves (GetValue("title")). I think it’s probably pretty obvious that I’m going to be using Reflection here to work out what property to put where, but OH NOES my properties are all upper camel case, and the data is all lower camel case. WHAT TO DO?

This is how a property is normally defined:

public string Title {get;set;}

Basic stuff. But we can add the Neo4jProperty attribute:

[Neo4jProperty(Name = "title")]
public string Title {get;set;}

And lo and behold, the properties will be reflected properly!
As an added extra – you can also tell the serialization process to Ignore a property if you want:

[Neo4jProperty(Ignore = true)]
public string Title {get;set;}

So the full Movie class looks like:

public class Movie
{
    [Neo4jProperty(Name = "title")]
    public string Title { get; set; }

    [Neo4jProperty(Name = "released")]
    public int? Released { get; set; }

    [Neo4jProperty(Name = "tagline")]
    public string Tagline { get; set; }
}
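For the curious, the reflection side of this can be sketched in a few lines. To be clear, this is a guess at the general technique, not the library’s implementation: SketchPropertyAttribute, SketchMapper and SketchMovie are all made-up names, the source is a plain dictionary rather than an INode, and Convert.ChangeType stands in for the driver’s As<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for Neo4jPropertyAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SketchPropertyAttribute : Attribute
{
    public string Name { get; set; }
    public bool Ignore { get; set; }
}

public class SketchMovie
{
    [SketchProperty(Name = "title")]
    public string Title { get; set; }

    [SketchProperty(Name = "released")]
    public int? Released { get; set; }

    [SketchProperty(Name = "tagline")]
    public string Tagline { get; set; }

    [SketchProperty(Ignore = true)]
    public string Notes { get; set; }
}

public static class SketchMapper
{
    // Walks the public properties of T and fills each one from the dictionary,
    // using the attribute's Name (or the property name) as the key.
    public static T ToObject<T>(IReadOnlyDictionary<string, object> source) where T : new()
    {
        var result = new T();
        foreach (var property in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            var attribute = property.GetCustomAttribute<SketchPropertyAttribute>();
            if (attribute?.Ignore == true || !property.CanWrite)
                continue;

            var key = attribute?.Name ?? property.Name;
            if (!source.TryGetValue(key, out var value) || value is null)
                continue; // leave the property at its default

            // Convert to the underlying type so int? properties accept a stored long.
            var targetType = Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType;
            property.SetValue(result, Convert.ChangeType(value, targetType));
        }
        return result;
    }
}
```

Feeding it a dictionary holding "title" and "released" (but no "tagline") fills Title and Released and leaves Tagline as null, which is the behaviour you’d want from schema-free data.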

GetRecords with ToObject (NetStandard 2.1 only)

So – we’ve seen it works on INode, but what if we want to return something other than just a node – the properties, say? Let’s change our query to:

var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m.title AS title, m.tagline AS tagline, m.released AS released";

Well. No worries! We’ve still got a working prospect:

await foreach (var record in cursor.GetRecords())
    return record.ToObject<Movie>();

Here, I’m using the GetRecords extension method to get each IRecord and attempt to cast it to a Movie. This works as the properties of Movie match the names of the aliases in the Cypher.

RunReadTransactionForObjects

In the previous examples, for simplification, I’ve not shown the code around the outside, but – if we take the GetContent example with ToObject – it actually looks like this:

var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
var session = _driver.AsyncSession();
var results = await session.ReadTransactionAsync(async x =>
{
    var cursor = await x.RunAsync(query, new {title});

    await foreach (var node in cursor.GetContent<INode>("m")) 
        return node.ToObject<Movie>();

    return null;
});

return results;

And we’ll do that for almost all the queries we’re going to run, so let’s look at how we can reduce that code as well…

var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
var session = _driver.AsyncSession();

var movie = 
    (await session.RunReadTransactionForObjects<Movie>(query, new {title}, "m"))
    .Single();
return movie;

I’ve added some newlines to make it a bit more readable, but now, we take the session and call RunReadTransactionForObjects<T> on it, to return the results of the query as T (in this case Movie).

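The RunReadTransactionForObjects<T> method returns an IEnumerable<T>, hence I can use .Single() (or indeed any of the LINQ methods). Bear in mind that .Single() throws an InvalidOperationException if nothing (or more than one thing) comes back, so .SingleOrDefault() is the one to reach for if you want null for an unknown title.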

Put it all together

There are other extension methods in there, and some which are no doubt missing (PR away!) but I’m quite pleased that I can get from this code:

public async Task<Movie> GetMovieByTitle(string title)
{
    var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
    var session = _driver.AsyncSession();
    var results = await session.ReadTransactionAsync(async tx =>
    {        
        var cursor = await tx.RunAsync(query, new {title});

        // Fetch in the loop condition so the cursor advances on every pass
        while (await cursor.FetchAsync())
        {
            var node = cursor.Current["m"].As<INode>();
            var movie = new Movie
            {
                Title = node.Properties["title"].As<string>(),
                Tagline = node.Properties["tagline"].As<string>(),
                Released = node.Properties["released"].As<int?>()
            };
            return movie;
        }

        return null;
    });

    return results;
}

To:

public async Task<Movie> GetMovieByTitle(string title)
{
    var query = "MATCH (m:Movie) WHERE m.title = $title RETURN m";
    var session = _driver.AsyncSession();

    var results = 
        (await session.RunReadTransactionForObjects<Movie>(query, new {title}, "m"))
        .Single();
    return results;
}

PHEW!

Long post eh?

If you got this far – well done! Please, try it, log bug reports (on GitHub – comments on here are easy to miss).

Neo4jClient 4.0

After what probably seems like ages to you (and indeed me) Neo4jClient 4.0 has finally left the pre-release stages and is now in a stable release.

Being a major version change, that means there are breaking changes, and you should be in the process of testing stuff before you just use it. Having said that, the changes which are there hopefully make sense, and will make it better in the long run.

It should be noted, that some of the things done are only applicable to Neo4j 4.x servers, and if you’re not using .NET Core, or not using Transactions – staying with the 3.2.x release of the client will be fine for now.

OK. Onto the changes, we have 4 broad categories, Breaking changes, General changes, Additions and Removals. I’ll try to demo as many of them as possible with code, but for the self-explanatory ones – well – I probably won’t 😮

Breaking

These are ones which will require you to do some code changes, how much will vary depending on your codebase.

Async Only

There are no ‘Sync’ methods anymore, decision wise – .NET code has been increasingly moving towards Async, and the client has always supported it. If there is enough clamour – I will look into re-adding the Sync wrappers, but at the moment, less-code = less to maintain.

No MSDTC (TransactionScope) Support

The 3.x versions of the client used TransactionScope to provide transaction support – which had a nice benefit of being able to support MSDTC. But, as I’m sure many know, it also prevented .NET Core support for transactions, as TransactionScope and its supporting classes weren’t available in Core. With NetStandard 2.0 TransactionScope was added, but the supporting classes were not, and whilst they are now available in NetStandard 2.1 – I don’t want to push the minimum requirements for the Client to NetStandard 2.1. As a consequence, the decision has been made to remove support for TransactionScope.

Long term – the aim is to have an ITransactionManager that you can inject into the Client to allow you to provide a home rolled TransactionScope manager if you really want it. This doesn’t exist – and won’t for a while because, as far as I’m aware, there was only one company using it, and even they said they were moving away from it.

Neo4j Server 3.5+ only

Neo4j don’t support server versions lower than 3.5, and whilst the GraphClient should work with any of the 3.x servers (which support transactions), the BoltGraphClient will only work back to 3.5.

Personally – I’m not too worried about this, aside from Transactions all of the additions to the client wouldn’t work on older server versions anyhow, as they didn’t exist. Basically – if you use an older version of the server – use the older client!

Other

There are other changes that will come with the ‘Removals’ section below, but I thought I’d write about them there!

General Things

So some general changes here about the client, you might find it interesting, you might not :/

NetStandard 2.0

Thanks to a PR from tobymiller1 – (https://github.com/tobymiller1) the client is back to being just one project, and targets NetStandard 2.0

URI Schemes

The client now supports all the schemes Neo4j does, so neo4j, neo4j+s, neo4j+sc and the bolt equivalents.

Transactions

I wasn’t sure if this should be Addition or Removal or … so I’ve gone with ‘General’. Transactions needed to be changed, as I noted above, and as part of the ability to target Neo4j 4.x we needed to support Multi-tenancy – that’s multi-databases within one server.

Let’s have a look at some examples:

using(var tx = gc.BeginTransaction(TransactionScopeOption.Join, null, "neo4jclient"))
{
    await gc.Cypher.Create("(n:Node {Id:1})").ExecuteWithoutResultsAsync();
    var insideResults = await gc.Cypher.Match("(n:Node)").Return(n => n.As<Node>()).ResultsAsync;
    insideResults.Dump("In the Transaction");

    await tx.RollbackAsync();
}

var outsideResults = await gc.Cypher.WithDatabase("neo4jclient").Match("(n:Node)").Return(n => n.As<Node>()).ResultsAsync;
outsideResults.Dump("Out of the Transaction");

Which gives out:

Let’s take the code apart a bit.

using(var tx = gc.BeginTransaction(TransactionScopeOption.Join, null, "neo4jclient"))

Transactions are IDisposable – so you ideally should be using them with a using statement, but if not, remember to Dispose()! In this version, you have to supply the TransactionScopeOption and a bookmark (the null in this case) to be able to define the database ("neo4jclient"). That’ll probably be simplified later…

Next, we’re just executing Cypher, as we’ve done plenty of times before, and as I’m using LinqPad I can call .Dump() on any output to get the results (we saw in the picture). So I can access the stuff I’ve put in the database, within the transaction.

I then .RollbackAsync() the transaction, the effects of which can be seen by the second .Dump() I execute showing nothing in the database. Also note, with the second Results query – I have to provide a WithDatabase parameter, else I would be querying the default database.

You HAVE to .CommitAsync() a transaction for it to be committed; if I didn’t have the .RollbackAsync() call, and just let the tx be Dispose()-ed by the using statement closing, it would automatically be rolled back. In the code above, the .RollbackAsync() call is redundant.

Write transactions are on one database only, an attempt to write to another database (using WithDatabase or Use) will result in a ClientException – nb. you can use Use to read from another database in a transaction.

Additions

New stuff for you to use! After each heading will be a list of the versions of Neo4j Server that the additions will work on.

DefaultDatabase [4.x]

Multi-tenancy brings a load of new things to the Neo4j world, but how can you use them from Neo4jClient?? First off let’s talk about DefaultDatabase. This is a property of the GraphClient/BoltGraphClient itself. By default it’s set to ‘neo4j’ (which is the default the server has), but you can set it to any other database, and every query will be against that one.

var gc = new GraphClient(...){DefaultDatabase = "neo4jclient"};

.WithDatabase() [4.x]

But what if you want to run just one of your queries (or more!) against another database? That’s where WithDatabase comes in. WithDatabase is a per query setting to choose the database a query will run on:

gc.Cypher.WithDatabase("neo4jclient").Match(...)

CreateDatabase (system) [4.x]

Want to create a database? Sure, you can do that! This needs to be executed against the system database, and you have to use the WithDatabase call to do that:

gc.Cypher.WithDatabase("system").CreateDatabase("neo4jclient", true).ExecuteWithoutResultsAsync();

Wait? What’s this true parameter? It’s the ifNotExists parameter which means that we will only create the database if it doesn’t exist.

StartDatabase (system) [4.x]

Well, now we’ve created our database, we need to start it:

gc.Cypher.WithDatabase("system").StartDatabase("neo4jclient").ExecuteWithoutResultsAsync();

If you run this on an already started database, it’ll do nothing!

StopDatabase (system) [4.x]

We’ve started it, now let’s stop it. No surprises here:

gc.Cypher.WithDatabase("system").StopDatabase("neo4jclient").ExecuteWithoutResultsAsync();

Again, as per the StartDatabase method, stopping a stopped database does nothing.

DropDatabase (system) [4.x]

Dropping a database is the quickest way to ‘truncate’ your entire database without resorting to stopping the server. As with CreateDatabase we have two parameters here, the first is the database name, the second is dumpData (default false). If dumpData is false then calling DropDatabase will delete all the data files, if it’s true the data will first be dumped to the location the server has specified (see the documentation for more information).

gc.Cypher.WithDatabase("system").DropDatabase("neo4jclient", true).ExecuteWithoutResultsAsync();

If you run this on a database that has already been dropped – you will get a FatalDiscoveryException thrown, if you want to avoid that, you need to use:

DropDatabaseIfExists (system) [4.x]

Pleasingly this is the same as DropDatabase just that you won’t get the exception if the database doesn’t exist. You have the same option to dumpData or not. It’s all up to you!

gc.Cypher.WithDatabase("system").DropDatabaseIfExists("neo4jclient", true).ExecuteWithoutResultsAsync();

Use [4.x]

Use() allows you to use a database within a query or query part. It means you can avoid having to use WithDatabase, or you can use it within a query that is using another database. Whaaaat?! Confusing?! Yes!

gc.Cypher.Use("neo4jclient").Match("(n)").Return(n => n.As<Node>())

Gives us:

USE neo4jclient
MATCH (n)
RETURN n

BUT you can’t do this for things like the system database calls above, you have to use the WithDatabase clause for that. Use is particularly Useful (ha!) for Fabric use cases.

WithQueryStats [3.5,4.x]

With 3.x when you executed a query, you couldn’t tell what that query had done, in fact, all you could know was that you executed a query (especially if it was a ‘non results’ one). Now you can use the .OperationCompleted event to get the stats of your query:

void OnGraphClientOnOperationCompleted(object o, OperationCompletedEventArgs e)
{        
    e.QueryStats.Dump();
}
gc.OperationCompleted += OnGraphClientOnOperationCompleted;
await gc.Cypher.WithQueryStats.Create("(n:Node {Id: 10, Db: 'neo4j'})").ExecuteWithoutResultsAsync();
gc.OperationCompleted -= OnGraphClientOnOperationCompleted;

Which will give you a QueryStats object that looks like:

Check that out! I added a new Label, 1 Node and 2 Properties!

This isn’t on by default, as it sends back more data over the wire, and if you don’t need it (which so far 100% of people haven’t) then it’s optional!

Neo4jIgnoreAttribute [3.5,4.x]

(From a PR by @Clooney24) This provides the ability to ignore properties for the BoltGraphClient as well as the GraphClient.

So, let’s have our class:

public class Node 
{
    public string Db {get;set;}
    public int Id {get;set;}

    [Neo4jIgnore]
    public string Ignored {get;set;}
}

We’ve defined the Ignored property with the [Neo4jIgnore] attribute, so now when we insert the data:

var node = new Node { 
        Id = 11, 
        Db = gc.DefaultDatabase, 
        Ignored = "You won't see this!" 
    };
await gc.Cypher.WithQueryStats.Create("(n:Node $newNode)").WithParam("newNode", node).ExecuteWithoutResultsAsync();

and then pull it back:

(await gc.Cypher.Use("neo4j").Match("(n)").Return(n => n.As<Node>()).ResultsAsync).Dump("Ignored");

We can see that Ignored is null:

And this isn’t just because we’re ignoring bringing it back, but it’s not in the database either:

Removals

These are all things that have been removed, largely, they were marked as [Obsolete] so you can’t say you weren’t warned! If you are using these, then you need to stay on a 3.x release of the client.

Start

This hasn’t been around since, well, 3.0 I think, and has certainly been deprecated for a long time. As it wouldn’t work in 3.5 onwards anyway I’m content to remove it.

Create(string, params object[]) 4/n

Again, marked as [Obsolete] and you should be using the alternative Create options instead.

Return<T>(string, CypherResultMode)

This was accidentally made public, it was never intended to be, and as it has had at least 1 major version of Obsolete-ness, it’s gone.

StartBit

This pretty much comes with the Start code above.

Gremlin support

If you’re using Gremlin – there are no benefits to this version of the client, so stick with what you have. You should progress to Cypher if you can, it’s actively developed and is very closely linked to the new GQL standards that ISO have started to work on.

From a ‘career’ point of view, this means that learning Cypher is like learning GQL but with maybe a different accent, but not a different language. As GQL becomes a standard, other Graph databases will start to use it, and you’ll be ahead of the game.

Finally

Some other bits of tidying up!

URIs

You may (or may not) have noticed that the URI for the client has changed from: https://github.com/Readify/Neo4jClient to https://github.com/DotNet4Neo4j/Neo4jClient. This means that the project is now part of an Organisation (DotNet4Neo4j) which is focused on things that link Neo4j and .NET together.

Err.

I think that’s about it.

༼ つ ◕_◕ ༽つ

Reactive Neo4j using .NET

Version 4.0 of Neo4j is being actively worked on, and aside from the new things in the database itself, the drivers get an update as well – and one of the big updates is the addition of a Reactive way to develop against the DB.

Now – I’ve not done reactive programming for a long time, I think I did play around with it when .NET 4 was first released, but I have no idea where that blog post has gone – so I may as well start anew.

I found it! Not the post, but the application – MousePath – which is now on GitHub: MousePath – aside from it ‘working’ it’s not performant in any way.

What is Rx/Reactive?

Reactive in .NET is all about the IObservable<T>/IObserver<T> interfaces. They’ve been around since .NET 4, but personally I’ve never really used them. They allow application code to react to data being pushed to it, rather than the more traditional way of requesting the data.

There’s a good book (Intro to Rx) which I’ve been using to work this out, which is freely available online: http://introtorx.com/ .

Starting off

For this project, we’re going to need the nuget package – which in this case isn’t Neo4j.Driver – but Neo4j.Driver.Reactive. When we add this to our project – and create a driver in the normal way – we can see we now have an ‘RxSession’ which is an extension method of the IDriver.

So let’s create a reactive session and see what we can see.

RxSession

We get IObservable as opposed to the AsyncSession giving us Tasks

AsyncSession

Doing a Run-ner

So, back to our RxSession, let’s do a basic version, just using Run

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
rxStatementResult
    .Records()
    .Subscribe(
        record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>())
    );

In here, we’re hooking up to the ol’ classic Movies database, and simply writing the titles to the screen. NB – Driver is a static property of type IDriver I have defined elsewhere.

The first two lines look pretty much like our normal code – the only real difference being the use of the ‘RxSession’ as opposed to just ‘Session’.

Run on an RxSession returns an IRxStatementResult – which has 3 methods we’re interested in, (well actually only 1 at the moment) – Records(), Consume() and Keys().

Records() gets us the records from the database, so the stuff we want to do things with, Consume() whips through those records so we can get an IResultSummary telling us what is going on, and Keys() gets us the keys that are returned, in the simple statement I’ve done – ‘m.title’ is the only key.

Records() is what we’re using, as we want to deal with the data. Records() returns an IObservable<IRecord>, and being IObservable – we need to Subscribe() to it to get the data. Subscribing means we provide an IObserver that will be notified whenever an IRecord arrives.

In this case, we have the contents being written to the console. Aces.

Quitting

Being a console app – doing tiny amounts of work – I largely don’t need to worry about disposing of my resources, but let’s imagine resource usage is something we do care about. How do you go about disposing of your resources?

IDisposable? INosable! – the IRxSession doesn’t implement IDisposable, instead we have to Close<T>() it – and this is where things have got a little fuzzy for me – I’m not entirely sure I’m closing it correctly.

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
rxStatementResult
    .Records()
    .Subscribe(
        record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>()));

session.Close<IRecord>();

Now, I expect to get either no results, or a smaller subset (depending on the speed of the code running) – what I get is the close being called, but still getting the full amount of data – I suspect I misunderstand what is going on here.

Let’s say I do want a smaller subset – or to quit – how do I do it? Well, the Subscribe() method actually returns an IDisposable – so we ‘unsubscribe’ by disposing of our subscription:

var session = Driver.RxSession();
var rxStatementResult = session.Run("MATCH (m:Movie) RETURN m.title");
var subscription = rxStatementResult
    .Records()
    .Subscribe(record => Console.WriteLine("Got: " + record.Values["m.title"].As<string>()));

await Task.Delay(220);
subscription.Dispose();
session.Close<IRecord>();

The ‘delay’ magic number there is enough time to get some records, but not all, anything less than that gave no results, anything more – all the results. ¯\_(ツ)_/¯
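The dispose-to-unsubscribe contract is easier to see with the driver taken out of the picture. This sketch uses only the IObservable<T>/IObserver<T> interfaces from the base class library; TitleSource and CollectingObserver are hypothetical names invented here, and nothing in it reflects how Neo4j.Driver.Reactive is actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a stream of movie titles.
public sealed class TitleSource : IObservable<string>
{
    private readonly List<IObserver<string>> _observers = new List<IObserver<string>>();

    // Subscribe hands back an IDisposable; disposing it is how you unsubscribe.
    public IDisposable Subscribe(IObserver<string> observer)
    {
        _observers.Add(observer);
        return new Unsubscriber(_observers, observer);
    }

    // Pushes a value to everyone currently subscribed.
    public void Push(string title)
    {
        // Iterate over a copy so an observer can unsubscribe while being notified.
        foreach (var observer in _observers.ToArray())
            observer.OnNext(title);
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly List<IObserver<string>> _observers;
        private readonly IObserver<string> _observer;

        public Unsubscriber(List<IObserver<string>> observers, IObserver<string> observer)
        {
            _observers = observers;
            _observer = observer;
        }

        public void Dispose() => _observers.Remove(_observer);
    }
}

// Records everything it is sent, so we can see what arrived.
public sealed class CollectingObserver : IObserver<string>
{
    public List<string> Received { get; } = new List<string>();

    public void OnNext(string value) => Received.Add(value);
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}
```

Subscribing a CollectingObserver, pushing "The Matrix", disposing the subscription and then pushing "Top Gun" leaves Received holding only the first title – the second push has nobody left to notify.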

Ooook So – Why Rx?

It seems more complex right? Subscribe(), unsubscribe – no foreach in sight! What’s the point?

My understanding – and this could be/probably is wrong – is that by using Rx – we’re reducing our overheads – i.e. instead of streaming everything, we can just stream what we’re consuming at the time.

The other key benefits come from things like ‘.Buffer’ and the other commands (Skip, Last, etc) allowing you to stream things in a better way.

One nice thing about Rx in .NET is that it’s not the same as async – you don’t have to have your entire stack in Rx to get the benefits – you can do bits and pieces where you need to – if you’ve got a lot of data maybe it makes sense for a given query.

Neo4j with Azure Functions

Recently, I’ve had a couple of people ask me how to use Neo4j with Azure functions, and well – I’d not done it myself, but now I have – let’s get it done!

  1. Log in to your Azure Portal

  2. Create a new Resource, and search for ‘function app’:

image

  3. Select ‘Function App’ from the Market Place:

image

  4. Press ‘Create’ to actually make one:

image

  5. Fill in your details as you want them

image

I’m assuming you’re reasonably au fait with the settings here, in essence if you have a Resource Group you want to put it into (maybe something with a VNet) then go for it, in my case, I’ve just created a new instance of everything.

  6. Create the function, and wait for it to be ready. Mayhaps make a tea or coffee, have a break from the computer for a couple of mins – it’s all good!

image

  7. When it’s ready, click on it and go into the Function App itself (if it doesn’t take you there!)

  8. Create a new function:

image

  9. We want to create an HttpTrigger function in C# for this instance:

image

  10. This gives us a ‘run.csx’ file, which will have a load of default code, you can run it if you want,

image

and you’ll see an output window appear which will say:

image

Well – good – Azure Functions work, so let’s get a connection to a Neo4j instance – now – for this I’m assuming you have an IP to connect to – you can always use the free tier on GrapheneDB if you want to play around with this.

  11. Add references to a driver

We need to add a reference to a Neo4j client, in this case, I’ll show the official driver, but it will work as well with the community driver. First off, we need to add a ‘project.json’ file, so press ‘View Files’ on the left hand side –

image

Then add a file:

image

Then call it project.json – and yes it has to be that name:

image

With our new empty file, we need to paste in the nuget reference we need:

{
   "frameworks": {
     "net46":{
       "dependencies": {
         "neo4j.driver": "1.5.2"
       }
     }
    }
}

Annoyingly if you copy/paste this into the webpage, the function will add extra ‘closing’ curly braces, so just delete those.

image

If you press ‘Save and Run’ you should get the same response as before – which is good, as it means that the Neo4j.Driver package has been installed. If we look at the files, we’ll see the ‘project.lock.json’ file, which is what we want to see.

image

  12. Code

We want to add our connection information now, we’re going to go basic, and just return the COUNT of the nodes in our DB. First we need to add a ‘using’ statement to our code:

So add,

using Neo4j.Driver.V1;

Then replace the code in the function with:

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
     using (var driver = GraphDatabase.Driver("bolt://YOURIP:7687", AuthTokens.Basic("user", "pass")))
     {
         using (var session = driver.Session())
         {
             IRecord record = session.Run("MATCH (n) RETURN COUNT(n)").Single();
             int count = record["COUNT(n)"].As<int>();
             return req.CreateResponse(HttpStatusCode.OK, "Count: " + count);                  
         }
     }
}

Basically, we’ll create a Driver, open a session and then return a 200 with the count!

  13. Run

You can now ‘Save and Run’ and your output window should now tell you the count:

image

  14. Done

Your first function using Neo4j, Yay!

Neo4jClient turns 3.0

Well, version wise anyhow!

This is a pretty big release and is one I’ve been working on for a while now (sorry!). Version 3.0 of the client finally adds support for Bolt. When Neo4j released version 3.0 of their database, they added a new binary protocol called Bolt designed to be faster and easier on the ol’ network traffic.

For all versions of Neo4jClient prior to 3.x you could only access the DB via the REST protocol (side effect – you also had to use the client to access any version of Neo4j prior to 3.x).

I’ve tried my hardest to minimise the disruption that could happen, and I’m pretty happy to have it down to mainly 1 line change (assuming you’re passing an IGraphClient around – you are right??)

So without further ado:

var client = new GraphClient(new Uri("http://localhost:7474/db/data"), "user", "pass");

becomes:

var client = new BoltGraphClient(new Uri("bolt://localhost:7687"), "user", "pass");

I’ve gone through a lot of trials with various objects, and I know others have as well (thank you!) but I’m sure there will still be errors – nothing is bug free! So raise an issue – ask on StackOverflow or Twitter 🙂

If you’re using `PathResults` you will need to swap to `BoltPathResults` – the format changed dramatically between the REST version and the Bolt version – and there’s not a lot I can do about it I’m afraid!

Excel & Neo4j? Let’s code that! (VSTO edition)

So you have a new Graph Database, it’s looking snazzy and graphy and all that, but, well – you really want to see it in a tabular format, ‘cos you’ve got this Excel program hanging about, and well – who doesn’t love a bit of tabular data?

Obviously there are loads of reasons why you might want that data – maybe to do a swanky graph, perhaps to pass over to the boss. You can also get that data into Excel in a few ways –

  • Open up from the Web – you can try opening up the REST endpoint in Excel directly – (I say try because quite frankly – it’s not looking like a good option)
  • Create an application to export to CSV – this is easy – writing a CSV/TSV/#SV is a doddle (in any language) but does mean you have to give it to people to run, and that might give more headaches – however it’s an option!
  • Create an Excel Addin that runs within Excel – slightly more complicated as you need to interact with Excel directly – but does have the benefit that maybe you can use it to send data back to the db as well.

As you can imagine, this is about doing the third option – to be honest, I would only ever pick options 2 or 3, and if I’m super honest – I would normally go for option 2 – as it’s the simplest. Option 3 however has some benefits I’d like to explore.

If you want to look at the project – you can find it at: https://github.com/DotNet4Neo4j/Neo4jDriverExcelAddin

I’ll be using the official driver (Neo4j.Driver) and VSTO addins, with VS 2017.

Onwards!

Sidenote

As I was writing this, I was going to do my usual – step-by-step approach, so went to take a screenshot and noticed this:

image

So we’re going to do a quick overview of the VSTO version, then the next post will tuck into the Excel Web version which looks snazzier – but I don’t have an example as of yet…

Onwards again!

Sidenote 2: Sidenote Harder

As the code is on github I’m not going to show everything, merely the important stuff, as you can get all the code and check it out for yourself!

So – pick the new VSTO addin option:

image

And create your project. You’ll end up with something like this:

image

OK, so an addin needs a few things –

  1. A button on the ribbon
  2. A form (yes, WinForm) to get our input (cypher)
  3. The code that executes stuff

The Form

That’s right. Form. Actually – UserControl, but still WinForms (Hello 2000), let’s add our interface to the project, right click and ‘Add New Item’:

image

For those who’ve not had the pleasure before, the key thing to learn is how the Anchors work to prevent your form doing weird stuff when it’s resized.

Add a textbox to the control:

image

Single line eh? That’s not very useful – let’s MULTI-LINE!

Right-click on the box and select Properties, and that properties window you never use pops up, ready to be used! Change the name to something useful – or leave it – it’s up to you – the key settings are Anchor and Multiline. Multiline should be true, and Anchor should then be set to all four sides (Top, Bottom, Left, Right):

image

If you resize your whole control now, you should see that your textbox will expand and contract with it – good times!

Drag a button onto that form and place it to the bottom right of your textbox, and now we need to set the anchors again, but this time to Bottom, Right so it will move with resizing correctly – also we should probably change the Text to something more meaningful than button1 – again – don’t let me preach UX to you! Play around, make it bigger, change the colour, go WILD.

Once your button dreams have been realised – double click on the button to be taken to the code behind. First we’ll add some custom EventArgs:

internal class ExecuteCypherQueryArgs : EventArgs
{
     public string Cypher { get; set; }
}

and then a custom EventHandler:

internal EventHandler<ExecuteCypherQueryArgs> ExecuteCypher;

Then we call that event when the button is pressed, so the UserControl code looks like:

public partial class ExecuteQuery : UserControl
{
    internal EventHandler<ExecuteCypherQueryArgs> ExecuteCypher;

    public ExecuteQuery()
    {
        InitializeComponent();
    }

    private void _btnExecute_Click(object sender, EventArgs e)
    {
        if (string.IsNullOrWhiteSpace(_txtCypher.Text))
            return;

        ExecuteCypher?.Invoke(this, new ExecuteCypherQueryArgs { Cypher = _txtCypher.Text });
    }
}
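
The other half of this is something subscribing to ExecuteCypher. Here’s a minimal sketch of that wiring which runs without WinForms – FakeExecuteQuery and SubscriberDemo are made-up stand-ins for illustration, not classes from the repo:

```csharp
using System;

// Same event args as above.
internal class ExecuteCypherQueryArgs : EventArgs
{
    public string Cypher { get; set; }
}

// Hypothetical stand-in for the ExecuteQuery UserControl (no WinForms needed).
internal class FakeExecuteQuery
{
    internal EventHandler<ExecuteCypherQueryArgs> ExecuteCypher;

    // Mimics _btnExecute_Click, with the textbox contents passed in as 'text'.
    internal void SimulateClick(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return;

        ExecuteCypher?.Invoke(this, new ExecuteCypherQueryArgs { Cypher = text });
    }
}

internal static class SubscriberDemo
{
    // Subscribes a handler, simulates a click, and returns the Cypher the handler received.
    internal static string Run(string text)
    {
        string received = null;
        var control = new FakeExecuteQuery();
        control.ExecuteCypher += (sender, e) => received = e.Cypher;
        control.SimulateClick(text);
        return received;
    }
}
```

In the real addin the subscriber would be a method on the addin itself rather than a lambda, but the `+=` hook-up is the same idea.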

The Ribbon

OK, we now have a form, but no way to see said form, so we need a Ribbon. Let’s add a new Ribbon (XML) to our project:

image

Open up the new .xml file and add the following inside the <group> element:

<button id="btnShowHide" label="Show/Hide" onAction="OnShowHideButton"/>

Now open the .cs file that has the same name as your .xml and add the following:

internal event EventHandler ShowHide;

public void OnShowHideButton(Office.IRibbonControl control)
{
    ShowHide?.Invoke(this, EventArgs.Empty);
}

Basically, we raise an event when the button is pressed. But what is listening for this most epic of notifications??? That’s right.. it’s:

ThisAddIn.cs

The unfortunate part about going from here on in is that this is largely plumbing… ugh! The code around how to show/hide a form I’ll skip over – it’s all in the GitHub repo and you can read it easily enough.

There are a couple of bits of interest – one is the ThisAddIn_Startup method, in which we create our Driver instance:

private void ThisAddIn_Startup(object sender, EventArgs e)
{
     _driver = GraphDatabase.Driver(new Uri("bolt://localhost"), AuthTokens.Basic("neo4j", "neo"));
}

To improve this, you’d want to get the URL and login details from the user somehow, perhaps via a settings form – but I’ll leave that to you! The important bit is that we store the IDriver instance in the addin. We only want one Driver instance per running Excel, so this is fine.

The other interesting method is ExecuteCypher (which is hooked up in the InitializePane method). It takes the results of our query and puts them into Excel:

private void ExecuteCypher(object sender, ExecuteCypherQueryArgs e)
{
    // Write into whichever sheet the user currently has active.
    var worksheet = (Worksheet) Application.ActiveSheet;

    // One session per query, disposed as soon as we're done with it.
    using (var session = _driver.Session())
    {
        var result = session.Run(e.Cypher);
        int row = 1;

        // One record per row, written down column A starting at A1.
        foreach (var record in result)
        {
            var range = worksheet.Range[$"A{row++}"]; //TODO: Hard coded range
            range.Value2 = record["UserId"].As<string>(); //TODO: Hard coded 'UserId' here.
        }
    }
}

Again – HardCoded ranges and ‘Columns’ (UserId) – you’ll want to change these to make sense for your queries, or even better, just make them super generic.
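
As a rough idea of what ‘super generic’ could look like, here’s a minimal sketch that turns any set of records into a header row plus one row per record. It’s deliberately driver-free: each record is modelled as a plain key/value dictionary, and RecordGrid / ToGrid are hypothetical names of mine, not something from the repo or the driver.

```csharp
using System;
using System.Collections.Generic;

internal static class RecordGrid
{
    // Builds a 2D grid: row 0 holds the column names, each following row holds one record.
    // Records are modelled as key -> value dictionaries (an assumption standing in for the driver's record type).
    internal static object[,] ToGrid(
        IReadOnlyList<string> keys,
        IEnumerable<IReadOnlyDictionary<string, object>> records)
    {
        var rows = new List<IReadOnlyDictionary<string, object>>(records);
        var grid = new object[rows.Count + 1, keys.Count];

        for (var col = 0; col < keys.Count; col++)
            grid[0, col] = keys[col];

        for (var row = 0; row < rows.Count; row++)
        {
            for (var col = 0; col < keys.Count; col++)
            {
                // Missing keys simply leave the cell empty (null).
                rows[row].TryGetValue(keys[col], out var value);

                // Strings and primitives go in as-is; anything else is flattened to text.
                var isSimple = value == null || value is string || value.GetType().IsPrimitive;
                grid[row + 1, col] = isSimple ? value : value.ToString();
            }
        }

        return grid;
    }
}
```

In the addin, the keys and values would come from the query result and its records, and Excel will accept a 2D array like this assigned to a suitably sized range’s Value2 in one go, which also saves a round trip per cell. How nodes and relationships should be flattened is left open here; ToString() is only a placeholder.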

Summing Up

So at this stage we have an Excel addin using VSTO that can run Cypher and display the results. There are things we’d probably want to change – firstly, remove all the hard coded stuff. But what about being able to ‘update’ results based on your query?? That’d be cool – and maybe something we’ll look at in the next addin based post (on Web addins).

BrowserStack Visual Studio 2015 Extension

Originally posted on: http://geekswithblogs.net/cskardon/archive/2016/01/08/browserstack-visual-studio-2015-extension.aspx

Do you find that you want BrowserStack, but once again the extension lets you down by not installing into VS2015?

Worry not! As with VS2013 (that I wrote about here) it’s a case of editing the vsixmanifest file and targeting a different Visual Studio version.

Download the extension from the VS gallery (here), open it up with 7zip/winzip/windows, and edit the extension.vsixmanifest file, changing the following lines from:

<Installation InstalledByMsi="false" AllUsers="true">
  <InstallationTarget Version="11.0" Id="Microsoft.VisualStudio.Pro" />
  <InstallationTarget Version="11.0" Id="Microsoft.VisualStudio.Premium" />
  <InstallationTarget Version="11.0" Id="Microsoft.VisualStudio.Ultimate" />
</Installation>

to:

<Installation InstalledByMsi="false" AllUsers="true">
  <InstallationTarget Id="Microsoft.VisualStudio.Pro" Version="[12.0, 15.0]" />
</Installation>

Save the file – make sure the archive is updated, double click and install! BOOM!
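
If you’d rather not hand-edit the XML, the same change can be sketched in code. This is a rough example that only covers the manifest text itself (getting it out of, and back into, the .vsix archive is left out), and VsixManifestPatcher / Retarget are names I’ve made up for the sketch:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

internal static class VsixManifestPatcher
{
    // Replaces every InstallationTarget under Installation with a single target
    // covering the given version range, and returns the updated manifest XML.
    // Note: XDocument.ToString() does not emit the XML declaration.
    internal static string Retarget(string manifestXml, string id, string versionRange)
    {
        var doc = XDocument.Parse(manifestXml);

        // Match on local name so the sketch doesn't need to know the manifest's namespace.
        var installation = doc.Descendants().First(e => e.Name.LocalName == "Installation");
        var ns = installation.Name.Namespace;

        installation.Elements()
            .Where(e => e.Name.LocalName == "InstallationTarget")
            .Remove();

        installation.Add(new XElement(ns + "InstallationTarget",
            new XAttribute("Id", id),
            new XAttribute("Version", versionRange)));

        return doc.ToString();
    }
}
```

Calling Retarget(xml, "Microsoft.VisualStudio.Pro", "[12.0, 15.0]") should give the same Installation block as the hand-edited version above.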

The three stages of *my* database choice

Originally posted on: http://geekswithblogs.net/cskardon/archive/2015/12/03/the-three-stages-of-my-database-choice.aspx

Prologue

I write and run a website called Tournr, a site to help people run competitions: it helps them organise and register for competitions, keeps the scores online and takes some of the pain out of competition management. I began it due to a badly run competition I attended (and ahem ran) a few years ago – and I wanted to make it better for first-timers (like myself) and old-hands alike. This post is about the database decisions and pains I’ve been through to get where I currently am. It’s long, and the TL;DR; is: I changed my DB.

Chapter 1 – SQL Server

I’m a .NET developer through and through – no bad thing, but it does tend to lead you in a certain train of thought – namely the Microsoft stack (WISA – Windows, IIS, SQL Server, ASP.NET). Personally, I don’t have the time, well, more the inclination to learn a new language when I’m comfortable in .NET and it does what I want it to do – so I created my first version.

tournr

I was also predominantly a desktop developer, and this was my first real foray into the world of web development, so the styling and colour choices, in a word – sucked. More importantly, the backend was progressing slowly. At the early stages of any project changes occur rapidly: some ideas which seem great begin to lose their shine after a week, or when someone else hears them and says ‘no, just no’.

So Tournr was based on SQL Server using the Entity Framework as its ORM – again – standard practice. I started to get fed up with writing migration scripts. I’m more your Swiss Army Knife developer, good at a lot of things, but not a super-specialized-amazeballs at one thing in particular – a generalist if you will – and I found the time spent migrating my database structure, writing SQL etc was delaying me from actually writing features. I know people who can reel out SQL easily and are super comfortable with it, and I’m ok, I can write queries for creating/deleting/joining etc, but not as efficiently as others.

Chapter 2 – RavenDB

Skip along 6 months, and I’d been playing with RavenDB at my workplace, and thought it looked like it might be a good fit for Tournr. So I took a month or so to convert Tournr to use Raven instead of SQL Server, and man alive! that was one of my best ever decisions, I felt freer in terms of development than I had for ages, instead of working out how my classes would fit, and whether I needed yet another lookup table, I could write my classes and just Save. Literally. A little note here: Raven has first class .NET integration, it is very easy to use.

I procrastinated for a while after the initial conversion and finally got Tournr released using RavenHQ for hosting the DB and life was good – including a new Logo.

Print

I could add new features relatively easily. Over time I found myself adding things into my class structures to make the queries simpler, and ended up doing a little bit of redundancy. As an example I would have a list of Competitors in a Class (not a code class, but a competition class – like Junior or Women’s for example), and if a competitor was registered in two Classes, they would in essence be copied into both, so my Tournament would have 2 Classes with the same Competitor in both. I won’t bore with details, but this encroachment started to happen a little bit more.

Brief interlude

I’m aware that anytime you write something about how you struggled with <technology>, the developers and users who love it and are passionate about it will think you’re:

a) doing it wrong
b) don’t understand the <technology>
c) vindictive because something went wrong
d) insert your own reason here!

It’s natural: people make decisions which they get invested in, and they want those decisions to be positively reinforced. If you read something saying ‘Oh I left <technology> because it was <insert your own negative phrase here>’, it’s like they’ve slapped you and said you’ve made the wrong choice.

So to those people. It was just that way for me, it’s not a personal attack on you -or- Raven, or indeed SQL Server.

I was talking with my partner about a new feature I wanted to add in, and as we talked about it, the structure started to become apparent, she drew a circle and lines going into it. I made the glib statement somewhere along the lines of “the problem is that what you’ve drawn there is basically a graph, it’s a bit more complex than that”. To which she responded “Why don’t you use the graph db?”.

I had no good answer. I’d been using Neo4j for a good few years so it’s not like I didn’t get it. Obviously it’s a big decision, switching from 1 DB to another is never a small thing, let alone from one type (document) to another (graph). Sure – I’d done it in the past from Relational to Document, but at that point *no-one* was using it, so it only affected me. This time I’d have users and Tournaments.

Now, Tournr isn’t used by many people at the moment, this is a blessing and a curse – the curse being that I’d love it to be used by more people 🙂 The blessing is that I can monitor it very closely and keep tabs on how the conversion has gone. Hooking in things like RayGun means I get near instant notification of any error which, combined with quick code turn-around, means I can respond very quickly.

Long and short of it. I thought ‘<expletive> it!’, and set to work…..

Before jumping there, let’s look at the positives and negatives of using Raven:

Positives:
  • Extremely fast to get up and running (I think it’s fair to say without Raven Tournr would not have been launched when it was)
  • Fits into C# / .NET code very well
Negatives:
  • You really need to buy into Ayende’s view of how to use the Database. This isn’t a bad thing in itself, but it does restrict your own designs.

Chapter 3 – Neo4j

At the point you take the plunge it’s important to get a quick win, even if (as it turns out) it’s superficial and full of lies and more LIES! I’m going to give a bit of an overview of Tournr’s structure, not going super deep – you don’t need to know that. Tournr was initially an ASP.NET MVC3 application, which was migrated to MVC5, along the way it stuck with the ASP.NET Membership system using first the Entity Framework version, and then a custom rolled RavenDB based version.

Whilst doing this conversion the *only* thing I allowed myself to do aside from the DB change was update the Membership to use ASP.NET Identity – and that was for two reasons –

1. There wasn’t a Neo4j based Membership codebase that I could see – so I’d have had to roll my own, and
2. There is a Neo4j Identity implementation (which I *have* helped roll).

Membership

Long story short – I added the Neo4j.Aspnet.Identity nuget package to my web project and switched out the Raven Membership stuff, this involved adding some identity code, setting up OWIN and other such-ness. The real surprise was that this worked. No problems at all – this was the quick win. I thought to myself – this is something that is not impossible.

Conversion – The rest

What? Membership and ‘The rest’ – it’s not exactly partitioning the post is it Chris? Well – no, and the reason is this – when I switched the membership – it compiled, started and let me login, register etc. Obviously I couldn’t load any tournaments, or rather I could, but I couldn’t tie the user accounts to them. When I switched the pulling of Tournaments etc all bets were off.

I like to go cold turkey. I removed the RavenDB nuget package from the project and winced at the hundreds of red squiggles and build errors. All that could be done from this point was a methodical step by step process of going through controllers replacing calls to Raven with calls to my new DB access classes. Anyhews, that aside – I ended up with an interface with a bucket load of methods.

Model 1

Woah there! You’re thinking – I think you missed a step there, what about the data model design – yes – you’re of course right. Prior to my conversion I had drawn out a model; we’ll call this Model 1. This was (as you can probably guess from the name) wrong. But that didn’t stop me, and that’s partly down to my personality – if I’m not doing something – I find it easy to get bored and then spend time reading the interwebs. Also – I know I’m going to find out some stuff that will change the model, so there’s no point in being too rigid about it.

In this model – I’d separated out a lot of things into individual nodes, for example – a User has a set of properties which are grouped in a class together representing Personal Registration details – things like country flag etc, and I had the model:

(User)-[:HAS_PERSONAL_DETAILS]->(PersonalDetails)

So I wrote a chunk of code around that.

Something you will find is that Neo4j doesn’t store complex types – simple arrays of simple types are cool, Dictionaries and objects are out. So you can quite easily separate out into individual nodes like above, and first cut – well – that’s the route I took.

So I plugged away, until I hit some of the bigger classes, this is where Raven had given me an easy run – Oh hey! You want to store all those nested classes? NO PROBLEM! That is awesomely powerful – and gives super super fast development times. Neo4j: not so forgiving. So, taking ‘Model 1’ as the basis I started to pick out the complex objects. Then EPIPHANY

Model 2 – The epiphany

In my view, for complex types which really are part of a Tournament or indeed a User, and in particular things I wasn’t going to search by, why create a new Node? Trade off – bigger nodes, but fewer of them – queries (or cyphers) become a bit simpler, but you can’t query as easily against the complex embedded types.

Maybe I needed an inbetween – where some complex types *were* nodes, and some were just serialized with the main object. Weird. A _middle ground_, can you have that in development?

So Model 2 takes Model 1 and combines some of the types which really didn’t need to be separate nodes. So Personal Details moved into the User, as I had no need to query on the data in there (and if I _do_ need to at a later date, well – I can add it then).

Special note for .NET devs – if you try to put a CREATE into Neo4j with a type that has a complex type for a property – Neo4j will b0rk at you. To get around this – you’ll need to provide a custom Json Converter to the Neo4jClient (obvs if you’re not using Neo4jClient this is totally irrelevant to you). There are examples of this on StackOverflow – and I imagine I’ll write some more on it later – probably try to update the Neo4jClient Wiki as well!
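
To give a flavour of what such a converter might look like, here’s a minimal Json.NET sketch that stores a complex property as a single JSON string, so the node only ever sees a simple string value. The User / PersonalDetails classes and the converter name are made up for illustration and aren’t Tournr’s actual code:

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical complex type that lives on the User rather than in its own node.
public class PersonalDetails
{
    public string Country { get; set; }
    public string Flag { get; set; }
}

public class User
{
    public string Name { get; set; }
    public PersonalDetails Details { get; set; }
}

// Writes PersonalDetails as one JSON string and reads it back from that string.
public class PersonalDetailsConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(PersonalDetails);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        // Uses JsonConvert rather than the passed-in serializer so this converter isn't re-entered.
        writer.WriteValue(JsonConvert.SerializeObject(value));
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null)
            return null;

        return JsonConvert.DeserializeObject<PersonalDetails>((string) reader.Value);
    }
}
```

Outside of Neo4jClient you can try it with JsonConvert.SerializeObject(user, new PersonalDetailsConverter()). With Neo4jClient, I’d expect to add the converter to the client’s list of JSON converters before running the CREATE, but check how your version exposes that rather than taking my word for it.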

Now, so far I imagine there are Top-Devs (TM)(R) slapping their foreheads over the general lack of planning, well hold onto your pants, let’s enter the heady world of TESTING.

I know what TDD is, I know what BDD is, I’m pretty certain I know what DDD is – but for Tournr I don’t really practice them. A few reasons – and I don’t really want to get into some sort of standards war here, but in a nutshell – Tournr wouldn’t be live if I’d tested the heck out of it. In the areas that matter – the important calculations etc, I have tests, but for some things – I just don’t. Quick note for potential hirers:  I do write tests professionally, use NCrunch etc, but this is very much a personal project and I take all the heat for it, and it’s a risk I’m willing to take at the moment.

So, for Tournr, once I’d been through the controllers and got it all compiling, I started testing my codebase. Funny thing – when you write a lot of code which for the majority of time *doesn’t compile*, issues do creep in. Mostly (in this case) it was related to complex types I’d missed or a missing closing brace in the Cypher.

>> Cypher

I’m not going to go into this very deeply either, but Cypher is amazeballs, think of it as the SQL of the Graph DB world (well, Neo4j world – actually not anymore – you can tell this post has been in the drafts for a while – check out OpenCypher), it’s clear concise and yes – like SQL you can go wrong. You might think that you don’t want to learn Yet Another Language when you know SQL – so why not use something like OrientDB – but think about it from another way. You use SQL to interact with Relational DB, with tables, foreign keys etc. You perform Joins between tables – to use that in a GraphDB would be quite a mental leap – and confuses matters – you end up having the same keyword meaning different things for different databases – you could end up writing a ‘select’ statement in your code against both DB types. With Cypher the language is tailored to the DB, and as such describes your queries from a Node / Relationship point of view, not a Tables point of view.

The changes I mainly did involved adding attributes like ‘JsonIgnore’ to my classes to prevent Neo4j serializing them (or attempting to), partly as it meant I could get development up and running faster, but also from the point of view of Migration. One of the problems with the conversion (indeed *any* conversion) is keeping existing elements, and that means translation. Raven stores documents key’d by the type – so if I store a ‘Tournament’, it is stored as a Tournament. When I query – I bring back a Tournament. Ah, but I’ve just JsonIgnored my properties – so when I bring back – it’s missing things.

Migration

Obviously – I have elements in database A and I want them in database B, how do we achieve those goals? Bearing in mind – I don’t want them to change their passwords or not be able to login. Luckily I store passwords as plain text — HA! Not really, in practical terms, I have changed the way the passwords are hashed by switching to the Identity model, and as a consequence – there is nothing I can do :/ Existing users have to reset their passwords – now – this is BAD. How do you write an email like that? ‘Hi, I decided unilaterally to change the backend – now you need to reset your password – sucks to be you!’ – of course not. A more diplomatic approach is needed – specifically, the migration should only take place in the quietest possible period – once again a bonus of the ‘not used much’ scenario I find myself in.

All the other migration requirements are relatively simple, of course I have to split out bits that need to be split out, create relationships etc, but none of that affects the users.

The biggest headache I thought would be getting stuff from Raven and then putting into Neo4j. Take a Tournament for example, in it, I had a List of Class, which in the Neo4j world is now represented as (Tournament)-[:HAS_CLASS]->(Class) so in the codebase for the Neo4j version, I removed the ‘Classes’ property. But now I can’t deserialize from Raven, as Tournament no longer has Classes.

This is where judicious use of Source Control (which we’re *all* using right?????) comes into play. Obviously at this point I’ve done a shed load of checkins – on a different branch – ready for the big ol’ merge, so it’s relatively easy to browse the last checkin before the branch and copy the Tournament class from there.

If I just whack in the class, the compiler will throw a wobbly, not to mention the Neo4j and Raven code will be unsure of which Tournament I mean.

So, let’s rename to ‘RavenTournament’ (cleeeever), but coming back to the point made a while ago – Raven can’t deserialize into RavenTournament as it’s looking for Tournament, oh but wait. It can. Of course it can, simply as well. The standard query from Raven’s point of view would be:

session.Query<Tournament>()

to get all the Tournaments. If I switch to:

session.Query<RavenTournament>()

it will b0rk, but, if I add:

session.Query<Tournament>().ProjectFromIndexFieldsInto<RavenTournament>()

I hit the mother lode: property wise RavenTournament is the same as Tournament was pre-Neo4j changes, and Raven can now deserialize.
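
Once the old shape is loaded, each tournament still has to be split into the new shape. Here’s a minimal sketch of that step, with the embedded classes pulled out so they can become their own nodes. All of the types and property names here are cut-down, hypothetical versions (the real classes have far more on them), and the actual writing to Neo4j is left out:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Old shape, as loaded from Raven: classes are embedded in the tournament.
public class RavenClass
{
    public string Name { get; set; }
}

public class RavenTournament
{
    public long Id { get; set; }
    public string Name { get; set; }
    public List<RavenClass> Classes { get; set; } = new List<RavenClass>();
}

// New shape for Neo4j: no Classes property, each class becomes its own node.
public class Tournament
{
    public long Id { get; set; }
    public string Name { get; set; }
}

public class Class
{
    public string Name { get; set; }
}

public class MigratedTournament
{
    public Tournament Tournament { get; set; }

    // One entry per (Tournament)-[:HAS_CLASS]->(Class) relationship to create.
    public List<Class> Classes { get; set; }
}

public static class TournamentMigration
{
    // Copies the simple properties across and pulls the embedded classes out into their own list.
    public static MigratedTournament Split(RavenTournament old)
    {
        return new MigratedTournament
        {
            Tournament = new Tournament { Id = old.Id, Name = old.Name },
            Classes = (old.Classes ?? new List<RavenClass>())
                .Select(c => new Class { Name = c.Name })
                .ToList()
        };
    }
}
```

Keeping the split as a pure function like this means it can be checked on its own before anything touches either database.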

A little word about IDs

By default Raven uses ids in the format: <type>/long, so a Tournament might be: Tournament/201. You can (and I did) change this so for example, I used ‘-‘ as the splitter: Tournament-201, and actually for Tournament – I just used a long. I can’t really change the IDs, or rather – I don’t want to, doing so means that existing links to tournaments are made invalid, of course I could add some sort of mapping code, but that seems like effort I shouldn’t need to spend. So, Tatham to the rescue (this is Tatham Oddie of Neo4jClient fame) with SnowMaker – an Azure Storage based ID generator. I won’t go into the how’s your fathers about how I’ve used it – it’s a pretty simple concept that you can look up and enjoy. Needless to say it’s made the conversion work.

Epilogue – Post Conversion Analysis

So, codewise am I in a better shape with the conversion – was it worth it? I think so – but thinking isn’t the same as knowing – so let’s fire up some analysis with the excellent NDepend. First we’ve got to define the baseline, and in this case we’re going to set that as the last RavenDB based version of the codebase (comparing to the SQL version would be pointless as too much has changed in between), and then define the codebase to compare to – well that’s the most recent Neo4j based version (actually it’s a version I’m currently working on – so includes some new features not in the Raven one – so think a bit more of ‘Raven Version+’).

The first and most basic stats come from the initial dashboard –

image

Here I can see general metrics, and things like LOC, Complexity have gone down – generally – a good thing, but the number of types has increased a lot.

Less code but more types? Query-wise with Raven you can just pull out the same objects as you insert – with Neo4j, I’ve found myself using more intermediaries, which is fine, and is part of the process – in fairness as time has gone on, I’ve realised a few of these are not used as much as I thought – and I can group them better – so if I was stopping dev now, and just maintaining – I’d expect the number to drop, in practical terms – does it matter? Probably not – a few lines of code here and there – might make it more maintainable – but it’s worth thinking about the ramifications of switching to a rarely used DB (Raven or Neo4j) – I could have a 50% drop in code size, but the technology still requires more of a leap to get used to than a basic relational DB implementation.

What about new dependencies? What have I added, what have I removed?

One of the great things about NDepend (among many) is the ability to write CQLinq (Code Query LINQ) – a ‘LINQ’ style querying language for your code base – so Third Party types used that weren’t before:

from t in ThirdParty.Types where t.IsUsedRecently()
select new {
t,
t.Methods,
t.Fields,
t.TypesUsingMe
}

gives us:

image

And types which were used before and now aren’t:

from t in codeBase.OlderVersion().Types where t.IsNotUsedAnymore()
select new {
t,
t.Methods,
t.Fields,
TypesThatUsedMe = t.TypesUsingMe
}

image

There are more metrics and NDepend has a lot of things to look at, but I’m wary of making this post overly long, and I neglected to set my baseline properly to show the trend charts (bad me). Ongoing though, I’m keeping track of my quality to ensure it doesn’t take a dive. (By the by – Patrick of NDepend has given me a copy of NDepend, you should know that. It is genuinely useful though – but do you know if I’m saying that or am I a suck up lackey???)

Things I’ve Learned

  1. First and foremost is that taking on a conversion project of something you have built from scratch is totally doable – it’s hard, and quite frankly dispiriting to have your codebase not compile for a couple of days, and then spend days fixing up the stuff that’s b0rked.
  2. You can spend a long time thinking about doing something – sometimes you just have to do it – and it’s ok to be wrong.
  3. Don’t be afraid to throw stuff away, if no-one is using something – delete it, if your model is wrong, redo.
  4. Fiddler is your friend with Neo4j in Async mode.
  5. I used short cuts – the migration was a one off – the code is a quick console app that does it in a dodgy but successful way that had ZERO tests. That’s right ZERO.