Problem: Data has noise. Data sets must be cleansed, normalized and curated. But you never really can tell for sure which algorithms to apply unless you compare results.
Steps:
- Plan to generate different visualizations of your data set,
- Plan to create and various transforms on your data set
- Run through a number of typical algorithms on each view of your dataset.
Purpose: Clarify which data transformations will reveal the structure of the dataset that underlies your problem.