bitquill - Spiros Papadimitriou

LOCI outlier detection

Under construction

LOCI is an outlier detection method based on local density that we developed back in 2003 (more details).

The main idea is the following: local data density around a point can be measured by the number of data points within a ball of a given radius (i.e., fixed volume). A data point is deemed to be an outlier iff its local density is significantly different from the average density of its neighbors. The key point is that "significantly different" is determined by comparing the relative difference in local density to the standard deviation of neighbor densities. Thus, rather than using a threshold expressed directly in terms of density, we use a threshold expressed relative to a statistical property that is also computed from the data. FIXME
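To make the idea above concrete, here is a minimal brute-force sketch in Python. It is only an illustration of the description in the previous paragraph, not the implementation mentioned on this page. The function names (local_counts, flag_outliers), the two radius parameters, and the default multiplier k_sigma = 3 are all assumptions made for this example. It also uses a single fixed pair of radii, so choosing good radii is left to the caller.

```python
import math

def local_counts(points, radius):
    """Local density proxy: number of points (including the point itself)
    within `radius` of each point. Brute force, O(n^2)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

def flag_outliers(points, count_radius, neighbor_radius, k_sigma=3.0):
    """Flag points whose local count deviates from the average count of their
    neighbors by more than k_sigma standard deviations of those counts.

    Hypothetical helper for illustration; k_sigma=3.0 is an assumed default.
    """
    counts = local_counts(points, count_radius)
    flags = []
    for p, n_p in zip(points, counts):
        # Counts of all points in the neighborhood of p (p itself included).
        neigh = [counts[j] for j, q in enumerate(points)
                 if math.dist(p, q) <= neighbor_radius]
        mean = sum(neigh) / len(neigh)  # always >= 1, since every count is >= 1
        std = math.sqrt(sum((c - mean) ** 2 for c in neigh) / len(neigh))
        # Both sides are normalized by the neighborhood average, so the
        # threshold is relative to the data rather than an absolute density.
        rel_diff = abs(mean - n_p) / mean
        rel_std = std / mean
        flags.append(rel_diff > k_sigma * rel_std)
    return flags

# Toy data: 30 evenly spaced points on a circle of radius 10,
# plus one isolated point at the center.
ring = [(10 * math.cos(2 * math.pi * i / 30), 10 * math.sin(2 * math.pi * i / 30))
        for i in range(30)]
points = ring + [(0.0, 0.0)]
flags = flag_outliers(points, count_radius=3.0, neighbor_radius=12.0)
print([i for i, f in enumerate(flags) if f])  # [30]: only the center point
```

In this toy example every ring point has a count of 3 (itself and its two adjacent points), while the center point has a count of 1. Because the comparison is against the spread of the neighbors' counts, the same k_sigma can be used regardless of how dense the data is overall.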

Code (experimental)

If you wish to determine whether LOCI provides an outlier definition that is appropriate for your data, you can use the following simple implementation in Matlab (the original Python code is unfortunately defunct). However, this implementation (done in 2004 as a side project for another paper that used similar concepts) has not been thoroughly tested. If (or rather, when?) you find any bugs, please let us know!

TODO put code

Remarks

TODO