Let’s go out a little and find things further away. This one around this one’s interesting. This one is interesting because it actually has many sub polygons. I mentioned that we’ve got multi polygons here. I will zoom in and show you. Furthermore, I think there was some over here. Here we go. This is interesting. So, this province Hartland has got a hole over here.
So, obviously, this particular town decided it didn’t want to be in that province. It gets quite complicated, quite amazing. An important point about this is that the border of this polygon is very complicated. So, we have got every tiny little detail at the finest resolution.
So, the polygon I’m drawing over here is actually very, very complicated. It has data for tens of thousands of points, which of course is quite impressive when you consider how fast the drawing is on this map. Also, it is impressive when I show you the next algorithm, which is a calculation of area. I’m going to select a button down here, and there we get the areas. And the area calculation here is say taking the sum of all the shells and then subtracting all the holes and giving you the correct combined area. And it does it instantly.
Now, I demonstrated this last year as a demo as well and it worked pretty well. But something that did not work last year was the calculation of distances. That went pretty badly. It took over a minute. And I use that minute to try to explain what the problem was with the distance calculation. This year, I’m pleased to report I have done an optimization. Let’s click on the distance selection over here. We should see some distances. There we go. That took, what do you think? Four seconds? I don’t think it took very long at all.
Now, the way the algorithm originally worked was it would break every polygon down into all of the little line segments that they made of tens of thousands of line segments for each polygon. And then, it would compare every line segment in one polygon to every other line segment in another that’s on the order of N². It explodes. The only way that you can do this fast is through simplifications. So, I took two approaches. One approach I can show you first is the convex hole. If we do a convex hole, that is a simplification of the polygon.
So, this is another algorithm that we’re demonstrating and I will talk about in the presentation. Convex can be thought of as taking an elastic band around the polygon. So, it makes the polygon of hundreds of thousands of points, something maybe that contains 10 to 20 points. And then suddenly, the comparisons are very, very, very fast. However, they don’t give the same answer. So, this is only an approximation. For example, if you look down here, this line is not correct because the actual border of Skåne is down here. I take away the convex hole and it’ll recalculate.
You’ll see it actually draws the line to a different point. It found a different closest point. So, the convex hole is not giving the right answers. It’s the correct answer for the convex hole, it’s not the right answer for these complex polygons. So, how did I get this to be fast? I’m going to show you when you get back to the presentation. Right. One last thing I will show you before we move to the presentation is the other thing that I did to make this fast is I stopped at calculating distances between every sub polygon. But I can put that back on again. Put that back on again, you see it’ll take a little longer but we still managed to get it done in a reasonable time.
But we’re going to end up with a lot of distances because it’s basically going to treat here… There we go. Every single sub polygon is treated as an independent polygon. It’s calculating the distances between them all, and that’s too much. So, we don’t have that on by default. So needless to say, today’s demo is a lot richer than it was last year. And we’ve figured out quite a few things that we didn’t know last year. So, I’m going to switch back to the presentation and explain these algorithms to you. Right. I’d like to talk about five algorithms.
We don’t have much time left, so I’ll go through them each quite quickly. And feel free to ask questions in the chat. We’ll try to answer them. The first one, point in polygon. I like that one a lot because I wrote that myself two years ago, and then Steph took it and ported it to geographic coordinate systems and moved it into the library. But the idea is very, very simple. So, I’ll just explain briefly. First of all, you need your polygon, which we’ve talked about. Well, this is the polygon work. We need to find points of interest. Yeah, this I need to explain first. How do you display points on a map if you need?
If you’ve got a database with hundreds of thousands of points, you only want to show points that either on the screen or in the circle, like I showed you in the demo, or within a polygon. And this is the way our demos working. It’s actually combining two functions, distance which I’ve highlighted there, and the line below it within a polygon. And the reason we use both of those functions is that we want to benefit from the spatial index that Neo4j has built-in. And within polygon function is something that we’ve added in this library. Neo4j doesn’t understand it, and therefore it won’t be able to use it for this special index. However, Neo4j does understand distance. So while we use both of those, when cipher sees this query, it knows that it can take that distance one and back it with an index and get a fast result and then do a post-filtering on the within a polygon.
So, what do you do if you want to use an index, but you don’t want that circle? You don’t want the distance function, you only want the polygon. There is a trick, and I’m going to explain that with this equation. If you compare the previous line, just go back one. This is a gap in the query there.
What I’m going to insert into that gap is some preparation that I can use later on. So, have a look at what’s added there. I’ve added a function called the bounding box. I passed the polygon to that function, it’s gonna return an object called bounding box which is a map containing two keys, max, and min. And now look what I’ve done. Instead of using a distance equation, I’ve said, “I need my points to be greater than min and less than max.” Neo4j cipher understands that as something it can use an index for. So what it’s done is, we’ve put a box around the polygon and we passed that to the index.
So, it’s going to find everything in that box, and then do a postal ring with the polygon. This is great. So we’ve tricked Neo4j into index backing without having any knowledge of polygons or any knowledge of what the within the function, within polygon function or the bounding box function means. That’s a very nice trick. Right. Moving to the actual point in the polygon algorithm. This is very simple. All you do is you take the point that you’re interested in and you take a ray in any direction. I’ve chosen just to follow the X-axis. And you count intersections between that ray and the polygon itself. If the number of intersections is odd, you’re inside the polygon. If it’s even, you’re not. Have a look at those two examples.