LOCI outlier detection
Under construction
LOCI is an outlier detection method based on local density that we developed in 2003 (more details).
The main idea is the following: the local data density around a point can be measured by the number of data points within a ball of a given radius (i.e., of fixed volume). A data point is deemed an outlier if and only if its local density differs significantly from the average density of its neighbors. The key point is that "significantly different" is determined by comparing the relative difference in local density to the standard deviation of the neighbor densities. Thus, rather than using a threshold expressed directly in terms of density, we use a threshold expressed relative to a statistical property that is itself computed from the data. FIXME
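The idea above can be sketched in a few lines of Python. This is only a minimal illustration for a single pair of radii, not the implementation discussed in the next section. The function names (`ball_counts`, `loci_flags`) and parameters (`count_radius`, `sampling_radius`, `k_sigma`) are hypothetical. Using one radius to count points and a second, larger one to decide who counts as a neighbor is an assumption, as are the default of three standard deviations and the two-sided test (taken from the wording "significantly different").

```python
import math
from statistics import mean, pstdev


def ball_counts(points, radius):
    """Local density of each point: how many points (itself included)
    lie within `radius` of it."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]


def loci_flags(points, count_radius, sampling_radius, k_sigma=3.0):
    """Flag points whose local density deviates from the average density of
    their neighbors by more than `k_sigma` standard deviations.
    Assumption: neighbors are all points within `sampling_radius`."""
    counts = ball_counts(points, count_radius)
    flags = []
    for i, p in enumerate(points):
        # Densities of the neighbors of p (p itself is included, so never empty).
        neighbor_counts = [
            counts[j]
            for j, q in enumerate(points)
            if math.dist(p, q) <= sampling_radius
        ]
        avg = mean(neighbor_counts)      # average neighbor density (always >= 1)
        sd = pstdev(neighbor_counts)     # standard deviation of neighbor densities
        rel_diff = (avg - counts[i]) / avg   # relative difference in density
        rel_sd = sd / avg                    # same normalisation for the threshold
        flags.append(abs(rel_diff) > k_sigma * rel_sd)
    return flags


# Toy data: a 10 x 10 unit grid plus one isolated point (index 100).
points = [(float(x), float(y)) for x in range(10) for y in range(10)]
points.append((15.0, 4.5))

flags = loci_flags(points, count_radius=1.5, sampling_radius=20.0)
print([i for i, flagged in enumerate(flags) if flagged])  # → [100]
```

In this toy example the sampling radius covers the whole data set, which keeps the numbers easy to check by hand. With a smaller sampling radius, each point is compared only against its own surroundings. Note that no density threshold is passed in: the cutoff is derived from the spread of the neighbor densities themselves.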
Code (experimental)
If you wish to determine whether LOCI provides an outlier definition that is appropriate for your data, you can use the following simple implementation in Matlab (the original Python code is unfortunately defunct). However, this implementation (written in 2004 as a side project for another paper that used similar concepts) has not been thoroughly tested. If (or rather, when?) you find any bugs, please let us know!
TODO put code
Remarks
TODO
