Better Know APOC #4 : apoc.coll.sort*
Neo4j.Version | 3.3.4 |
APOC Version | 3.3.0.2 |
If you haven’t already, please have a look at the intro post to this series, it’ll cover stuff which I’m not going to go into on this.
OK, ‘apoc.coll’ has 43 (that’s right – 43) functions and procedures, but I’m only going to cover the ‘sort’ ones for this post – why? Because a post containing 43 different functions – whilst a good % of the overall, would be way too long.
As it is, with ‘sort’ we have 4 functions:
- apoc.coll.sort
- apoc.coll.sortMaps
- apoc.coll.sortMulti
- apoc.coll.sortNodes
The Whys
These are methods to sort collections, the clue is in the name, but why do we need them? We can sort in Cypher right? We have ‘ORDER BY
‘, who wrote this extra bit of code that has no use?? Who?!?!
Hunger
. Hmmm given his pedigree we may have to assume this was done for a reason… Let’s explore that a bit with the apoc.coll.sort
method…
apoc.coll.sort
This is your basic sort method, given a collection, return it sorted. It doesn’t matter what type the collection is, it will sort it.
Parameters
Just the one for in and one for out, the in is the collection to sort, the out is the sorted collection.
Examples
We’ll look (for this case) at doing it the traditional Cypher way, and then the APOC way.
The Cypher way
It’s worth seeing the Cypher way so you can appreciate sort, this is based on this question on Stack Overflow.
We’ll have a collection which is defined as such:
WITH [2,3,6,5,1,4] AS collection
Let’s sort this the Cypher way – easy!
WITH [2,3,6,5,1,4] AS collection RETURN collection ORDER BY ????
Errr, ok, looks like we’re gonna need to tap into some unwinding!
WITH [2,3,6,5,1,4] AS collection UNWIND collection AS item WITH item ORDER BY item RETURN collect(item) AS sorted
that’s got it! So we UNWIND
the collection, then WITH
each item (ORDER BY
) we then COLLECT
them back again.
The APOC way
WITH [2,3,6,5,1,4] AS collection RETURN apoc.coll.sort(collection) AS sorted
That’s a lot easier to read, it’s also a lot easier to use inline. The Cypher version above might look ok, but imagine you have a more complicated query, and you need to either do multiple sorts, or even just anything extra, it can quickly become unwieldy.
apoc.coll.sortMaps
A Map (or Dictionary for those .NETters out there) is the sort of thing we return from Neo4j all the time, and this function allows us to sort on a given property of a Map.
Examples
For these examples, we’ll have ‘coll’ defined as:
WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll
An array of maps, with an ‘Str
‘ property, and a ‘Num
‘ property.
Sort by string
WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll RETURN apoc.coll.sortMaps(coll, 'Str')
Returns us a list of the maps, looking like:
╒══════════════════════════════════════════════════════════════════════╕ │"apoc.coll.sortMaps(coll, 'Str')" │ ╞══════════════════════════════════════════════════════════════════════╡ │[{"Str":"A","Num":4},{"Str":"B","Num":3},{"Str":"C","Num":2},{"Str":"D│ │","Num":1}] │ └──────────────────────────────────────────────────────────────────────┘
In which we can see the maps go from ‘A’ to ‘D’
Sort by Number
WITH [{Str:'A', Num:4}, {Str:'B', Num:3}, {Str:'D', Num:1}, {Str:'C', Num:2}] AS coll RETURN apoc.coll.sortMaps(coll, 'Str')
Unsurprisingly, this gets us the following:
╒══════════════════════════════════════════════════════════════════════╕ │"apoc.coll.sortMaps(coll, 'Num')" │ ╞══════════════════════════════════════════════════════════════════════╡ │[{"Str":"D","Num":1},{"Str":"C","Num":2},{"Str":"B","Num":3},{"Str":"A│ │","Num":4}] │ └──────────────────────────────────────────────────────────────────────┘
Which goes from 1 to 4.
Sort order is Ascending, there is no way to do a descending sort. You basically do a ‘reverse’ to get the sort the other way.
apoc.coll.sortMulti
This is the equivalent of doing a ‘Sort, Then By’ – so if I take the ‘sortMaps’ function above and run it like so:
WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll RETURN apoc.coll.sortMaps(coll, 'First')
I get:
╒══════════════════════════════════════════════════════════════════════╕ │"apoc.coll.sortMaps(coll, 'First')" │ ╞══════════════════════════════════════════════════════════════════════╡ │[{"Last":"A","First":"A"},{"Last":"B","First":"B"},{"Last":"A","First"│ │:"B"},{"Last":"A","First":"C"}] │ └──────────────────────────────────────────────────────────────────────┘
The problem here is the two elements:
{"Last":"B","First":"B"},{"Last":"A","First":"B"}
I want these to be the other way around, so I have to switch to ‘Multi’:
WITH [{First:'B', Last:'B'}, {First:'A', Last:'A'}, {First:'B', Last:'A'}, {First:'C', Last:'A'}] AS coll UNWIND apoc.coll.sortMulti(coll, ['^First', '^Last']) AS unwound RETURN unwound.First AS first, unwound.Last AS last
This get’s me:
╒═══════╤══════╕ │"first"│"last"│ ╞═══════╪══════╡ │"A" │ "A" │ ├───────┼──────┤ │"B" │ "A" │ ├───────┼──────┤ │"B" │ "B" │ ├───────┼──────┤ │"C" │ "A" │ └───────┴──────┘
One this to note here – (and I think it’s quite important) is that this is the only method that defaults to Descending order. To get Ascending search, you have to prefix columns with a ‘^’ character (as I’ve done in this case).
apoc.coll.sortNodes
Nearly there! This takes a collection of nodes and sorts them on 1 property – so, let’s add some nodes:
CREATE (n1:CollNode {col1: 1, col2: 'D'}) CREATE (n2:CollNode {col1: 2, col2: 'C'}) CREATE (n3:CollNode {col1: 3, col2: 'B'}) CREATE (n4:CollNode {col1: 4, col2: 'A'})
And let’s do a sort:
MATCH (n:CollNode) WITH apoc.coll.sortNodes(COLLECT(n), 'col2') AS sorted UNWIND sorted AS n RETURN n.col1 AS col1, n.col2 AS col2
Now, you could argue this adds little to the party as you can already ORDER BY, and by and large you’re right – the nice thing about the apoc version is that you can call it as I have above, rather than having to do the sort afterwards. Having said that, ORDER BY
does have a DESC
keyword as well, which sortNodes
does not :/
Conclusions
apoc.coll.sort*
is useful, that’s the main thrust, some are more useful than others, and there are a few omissions (like the ability to sort desc for all but the sortMulti
method) which could be good simple pull requests.
They are what they are, sorting methods 🙂