The General Gist:

So here's the general gist of the program:

1) We read in an XML file using the fileReader provided by Prefuse

2) We gather the information about the nodes, normalize the data, and create instance(s) from the resulting doubles

3) A user moves two nodes. A constraint of "same" or "not same" is placed on those two nodes depending on the distance between them.

4) PCKMeans is called with an ArrayList of said constraints and the Instances made up of the instance(s); it returns the instances grouped into the requested number of clusters.

5) We create nodes for the cluster centers, and then attach edges from the centers to the associated nodes.

6) We remove the cluster centers and edges the next time the user moves a node.

7) Repeat steps 3 through 6 until satisfied.
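
To make steps 2 and 3 a bit more concrete, here's a rough sketch in Java. It is not our actual code: the class and method names are made up for illustration, min-max scaling is only one way the data might be normalized, and the distance threshold that decides "same" versus "not same" is an assumed parameter.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; names and rules are assumptions.
final class ConstraintSketch {

    /** Label placed on a pair of nodes after the user moves them. */
    enum Kind { SAME, NOT_SAME }

    /** A pairwise constraint between two instances, identified by index. */
    static final class Constraint {
        final int first;
        final int second;
        final Kind kind;

        Constraint(int first, int second, Kind kind) {
            this.first = first;
            this.second = second;
            this.kind = kind;
        }
    }

    /** Min-max scales each column to [0, 1]; a constant column becomes all zeros. */
    static double[][] normalize(double[][] rows) {
        if (rows.length == 0) {
            return new double[0][];
        }
        int cols = rows[0].length;
        double[] min = new double[cols];
        double[] max = new double[cols];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        for (double[] row : rows) {
            for (int c = 0; c < cols; c++) {
                min[c] = Math.min(min[c], row[c]);
                max[c] = Math.max(max[c], row[c]);
            }
        }
        double[][] out = new double[rows.length][cols];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < cols; c++) {
                double range = max[c] - min[c];
                out[r][c] = range == 0 ? 0 : (rows[r][c] - min[c]) / range;
            }
        }
        return out;
    }

    /** Euclidean distance between two on-screen node positions. */
    static double distance(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    /** Assumed rule: nodes left within the threshold are "same", otherwise "not same". */
    static Constraint fromMove(int a, int b, double dist, double threshold) {
        return new Constraint(a, b, dist <= threshold ? Kind.SAME : Kind.NOT_SAME);
    }
}
```

Each time the user drops a node, something like `fromMove` would produce one constraint, and the accumulated constraints would go into the ArrayList that gets handed to PCKMeans in step 4.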

The idea is to incorporate user feedback into the clustering. The clustering algorithm can go against the user-created constraints, but only if it really, really wants to, and even then it incurs a penalty. So if the user screws up entirely, the program will know. Because the user supplies feedback by manually moving the nodes, it should take less time to arrive at the correct clusters.
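
The penalty idea can be sketched as a score the clustering tries to keep small: the usual k-means distortion plus a fixed cost for every constraint that gets broken. This is only an illustration of that kind of objective, not the PCKMeans implementation we call. The class name, the array-based inputs, and the single uniform penalty weight are all assumptions.

```java
// Hypothetical illustration of a penalized clustering objective; not the library's code.
final class PenaltySketch {

    /** Squared Euclidean distance between two equal-length vectors. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /**
     * Distortion plus penalties for violated constraints.
     * labels[i] is the index of the center that points[i] is assigned to.
     * same and notSame hold pairs of point indices.
     */
    static double objective(double[][] points, double[][] centers, int[] labels,
                            int[][] same, int[][] notSame, double penalty) {
        double total = 0;
        // Ordinary k-means term: how far each point is from its center.
        for (int i = 0; i < points.length; i++) {
            total += squaredDistance(points[i], centers[labels[i]]);
        }
        // A "same" pair split across two clusters costs a penalty.
        for (int[] pair : same) {
            if (labels[pair[0]] != labels[pair[1]]) {
                total += penalty;
            }
        }
        // A "not same" pair placed in one cluster costs a penalty.
        for (int[] pair : notSame) {
            if (labels[pair[0]] == labels[pair[1]]) {
                total += penalty;
            }
        }
        return total;
    }
}
```

With a large penalty the algorithm will almost always honor the constraints. With a small one it is free to override them whenever the data disagree strongly, which is the "really, really wants to" case.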

We'll be running some experiments, and I'll post the results up here, as well as our final paper and my final report.

Clustering Algorithms