Better Know APOC #4 : apoc.coll.sort*

Neo4j.Version 3.3.4
APOC Version 3.3.0.2

If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.


OK, ‘apoc.coll’ has 43 (that’s right – 43) functions and procedures, but I’m only going to cover the ‘sort’ ones for this post – why? Because a post containing 43 different functions – whilst a good % of the overall, would be way too long.

As it is, with ‘sort’ we have 4 functions:

  • apoc.coll.sort
  • apoc.coll.sortMaps
  • apoc.coll.sortMulti
  • apoc.coll.sortNodes

The Whys

These are methods to sort collections, the clue is in the name, but why do we need them? We can sort in Cypher right? We have ‘ORDER BY‘, who wrote this extra bit of code that has no use?? Who?!?!

When Hunger strikes, run for cover

Hunger. Hmmm given his pedigree we may have to assume this was done for a reason… Let’s explore that a bit with the apoc.coll.sort method…

apoc.coll.sort

This is your basic sort method, given a collection, return it sorted. It doesn’t matter what type the collection is, it will sort it.

Parameters

Just the one for in and one for out, the in is the collection to sort, the out is the sorted collection.

Examples

We’ll look (for this case) at doing it the traditional Cypher way, and then the APOC way.

The Cypher way

It’s worth seeing the Cypher way so you can appreciate sort, this is based on this question on Stack Overflow.

We’ll have a collection which is defined as such:

WITH [2,3,6,5,1,4] AS collection

Let’s sort this the Cypher way – easy!

WITH [2,3,6,5,1,4] AS collection
 RETURN collection ORDER BY ????

Errr, ok, looks like we’re gonna need to tap into some unwinding!

WITH [2,3,6,5,1,4] AS collection
UNWIND collection AS item
WITH item ORDER BY item
RETURN collect(item) AS sorted

that’s got it! So we UNWIND the collection, then WITH each item (ORDER BY) we then COLLECT them back again.

The APOC way

WITH [2,3,6,5,1,4] AS collection
RETURN apoc.coll.sort(collection) AS sorted

That’s a lot easier to read, it’s also a lot easier to use inline. The Cypher version above might look ok, but imagine you have a more complicated query, and you need to either do multiple sorts, or even just anything extra, it can quickly become unwieldy.

apoc.coll.sortMaps

A Map (or Dictionary for those .NETters out there) is the sort of thing we return from Neo4j all the time, and this function allows us to sort on a given property of a Map.

Examples

For these examples, we’ll have ‘coll’ defined as:

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll

An array of maps, with an ‘Str‘ property, and a ‘Num‘ property.

Sort by string

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Str')

Returns us a list of the maps, looking like:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'Str')"                                     │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Str":"A","Num":4},{"Str":"B","Num":3},{"Str":"C","Num":2},{"Str":"D│
│","Num":1}]                                                           │
└──────────────────────────────────────────────────────────────────────┘

In which we can see the maps go from ‘A’ to ‘D’

Sort by Number

WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
RETURN apoc.coll.sortMaps(coll, 'Str')

Unsurprisingly, this gets us the following:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'Num')"                                     │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Str":"D","Num":1},{"Str":"C","Num":2},{"Str":"B","Num":3},{"Str":"A│
│","Num":4}]                                                           │
└──────────────────────────────────────────────────────────────────────┘

Which goes from 1 to 4.

Sort order is Ascending, there is no way to do a descending sort. You basically do a ‘reverse’ to get the sort the other way.

apoc.coll.sortMulti

This is the equivalent of doing a ‘Sort, Then By’ – so if I take the ‘sortMaps’ function above and run it like so:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
RETURN apoc.coll.sortMaps(coll, 'First')

I get:

╒══════════════════════════════════════════════════════════════════════╕
│"apoc.coll.sortMaps(coll, 'First')"                                   │
╞══════════════════════════════════════════════════════════════════════╡
│[{"Last":"A","First":"A"},{"Last":"B","First":"B"},{"Last":"A","First"│
│:"B"},{"Last":"A","First":"C"}]                                       │
└──────────────────────────────────────────────────────────────────────┘

The problem here is the two elements:

{"Last":"B","First":"B"},{"Last":"A","First":"B"}

I want these to be the other way around, so I have to switch to ‘Multi’:

WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll
UNWIND apoc.coll.sortMulti(coll, ['^First', '^Last']) AS unwound
RETURN unwound.First AS first, unwound.Last AS last

This get’s me:

╒═══════╤══════╕
│"first"│"last"│
╞═══════╪══════╡
│"A"    │  "A" │
├───────┼──────┤
│"B"    │  "A" │
├───────┼──────┤
│"B"    │  "B" │
├───────┼──────┤
│"C"    │  "A" │
└───────┴──────┘

One this to note here – (and I think it’s quite important) is that this is the only method that defaults to Descending order. To get Ascending search, you have to prefix columns with a ‘^’ character (as I’ve done in this case).

apoc.coll.sortNodes

Nearly there! This takes a collection of nodes and sorts them on 1 property – so, let’s add some nodes:

CREATE (n1:CollNode {col1: 1, col2: 'D'})
CREATE (n2:CollNode {col1: 2, col2: 'C'})
CREATE (n3:CollNode {col1: 3, col2: 'B'})
CREATE (n4:CollNode {col1: 4, col2: 'A'})

And let’s do a sort:

MATCH (n:CollNode)
WITH apoc.coll.sortNodes(COLLECT(n), 'col2') AS sorted
UNWIND sorted AS n
RETURN n.col1 AS col1, n.col2 AS col2

Now, you could argue this adds little to the party as you can already ORDER BY, and by and large you’re right – the nice thing about the apoc version is that you can call it as I have above, rather than having to do the sort afterwards. Having said that, ORDER BY does have a DESC keyword as well, which sortNodes does not :/

Conclusions

apoc.coll.sort* is useful, that’s the main thrust, some are more useful than others, and there are a few omissions (like the ability to sort desc for all but the sortMulti method) which could be good simple pull requests.

They are what they are, sorting methods 🙂

Leave a Reply