9/17/2023

Psotove linear scatter plot

A while back I was given an intriguing task: rank scatter plots by similarity. It is an unusual request but not unheard of: see this question or this one.

[Interactive demo widget]

The above is a demo of the problem at hand. (Source code.)

There are standard statistical techniques for this sort of problem. I was drawn to the Wasserstein metric, which is used in the popular Fréchet Inception Distance (FID) in machine learning. This is the “statistical distance” in the widget.

Unfortunately for me, the client came with a solution in mind already. They wanted to enclose the points in shapes and then calculate the overlap of those shapes. In more technical terms, they wanted to find convex hulls of the points and then calculate the areas of intersection between them.

[Figure: A comparison of different similarity metrics for 2D scatter plots.]

My objections were:

1. The statistical properties of the polygon method are not adequate. For example, it does not handle outliers well.
2. It would take too much development time and this was only one of many tasks.
3. ~~The convex hull method would be much slower than the Wasserstein metric.~~

I’ve crossed out the third point because the convex hull method can be made fast. However, the other two points remain valid.

Thankfully the client was open to my suggestions and we did implement the Wasserstein metric. (Users weren’t shown the scores, just similar plots.)

Later, in my own capacity, I challenged myself to try the other methods. How much more difficult was it to do the polygon method? We also brainstormed using circles and ellipses. This post is the outcome of that investigation.

There are too many algorithms to go into proper detail for each one. Instead I provide a high-level overview and give references for further information.

[Table: where the code can be viewed, time complexity and approximate LOC for each method. Surviving entries: Gnimuc/Hungarian.jl, /LiorSinai/AssignmentProblem.jl]

The above table shows where the code can be viewed, the time complexity and the approximate number of lines of code (LOC). If one bases the effort solely on LOC, the original polygon method required more than 30× the effort of the Wasserstein metric.

A more detailed breakdown is available at the end in the Code analysis section.

For those who just want to compare scatter plots the right way, jump to How to compare scatter plots.

Preliminaries

Bivariate normal distributions

There are many different kinds of 2D distributions. The data that I was working with was normally distributed.
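For normally distributed data, the Wasserstein metric has a closed form (the same Fréchet distance that FID is built on), so two scatter plots can be compared using only their sample means and covariances. The following is a minimal sketch of that idea, not the code behind this post: the function names `mean_and_cov` and `wasserstein2_gaussian` are hypothetical, and it assumes each plot is well described by a single bivariate normal distribution. It uses the fact that for a 2×2 positive semi-definite matrix M, tr(√M) = √(tr M + 2√(det M)), which avoids a general matrix square root.

```python
import math

def mean_and_cov(points):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def wasserstein2_gaussian(mean1, cov1, mean2, cov2):
    """2-Wasserstein distance between two bivariate normal distributions.

    W^2 = |m1 - m2|^2 + tr(C1) + tr(C2) - 2 tr((C1 C2)^(1/2)),
    where the last trace is evaluated with the 2x2 identity
    tr(sqrt(M)) = sqrt(tr(M) + 2 sqrt(det(M))).
    """
    (a, b), (_, d) = cov1  # cov1 = [[a, b], [b, d]], assumed symmetric
    (e, f), (_, h) = cov2  # cov2 = [[e, f], [f, h]], assumed symmetric
    mean_term = (mean1[0] - mean2[0]) ** 2 + (mean1[1] - mean2[1]) ** 2
    trace_prod = a * e + 2 * b * f + d * h        # tr(C1 C2)
    det_prod = (a * d - b * b) * (e * h - f * f)  # det(C1 C2)
    # max(..., 0.0) guards against tiny negative values from rounding.
    cross = math.sqrt(max(trace_prod + 2 * math.sqrt(max(det_prod, 0.0)), 0.0))
    w2_squared = mean_term + (a + d) + (e + h) - 2 * cross
    return math.sqrt(max(w2_squared, 0.0))

def scatter_distance(points1, points2):
    """Distance between two scatter plots under the normality assumption."""
    m1, c1 = mean_and_cov(points1)
    m2, c2 = mean_and_cov(points2)
    return wasserstein2_gaussian(m1, c1, m2, c2)
```

Ranking plots by similarity then amounts to sorting candidates by `scatter_distance` to a reference plot, smallest first. If the data is clearly not normal, this closed form is only an approximation of the true Wasserstein metric.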