Most data analysts expect to spend more time cleaning and formatting their data than they end up spending on data display and analysis. Data cleaning is an art as well as a skill; the path you follow may differ with each data set. Fortunately, the Data Desk interface makes those paths easier to follow, and Data Desk provides a number of tools that prove invaluable for the task.
Books about data science usually discuss a set of basic data manipulation functions. These are most often called:
Append,
Join or Merge,
Dedupe,
Filter,
Aggregate, and
Feature Creation.
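For readers who know these operations better from code than by name, here is a minimal sketch of all six in plain Python. It is not Data Desk and not how Data Desk does it; the tables, column names, and thresholds are hypothetical, invented only to make each term concrete.

```python
from collections import defaultdict

# Hypothetical toy tables, invented for illustration only.
sales_q1 = [
    {"id": 1, "region": "East", "amount": 100},
    {"id": 2, "region": "West", "amount": 250},
]
sales_q2 = [
    {"id": 2, "region": "West", "amount": 250},  # repeats a Q1 row
    {"id": 3, "region": "East", "amount": 75},
]
managers = {"East": "Ann", "West": "Bob"}

# Append: stack the rows of one table under another.
appended = sales_q1 + sales_q2

# Join (merge): attach columns from a second table by matching a key.
joined = [{**row, "manager": managers[row["region"]]} for row in appended]

# Dedupe: keep only the first row seen for each id.
seen = set()
deduped = []
for row in joined:
    if row["id"] not in seen:
        seen.add(row["id"])
        deduped.append(row)

# Filter: keep only the rows that satisfy a condition.
filtered = [row for row in deduped if row["amount"] >= 100]

# Aggregate: summarize a column within groups (total amount per region).
totals = defaultdict(int)
for row in deduped:
    totals[row["region"]] += row["amount"]

# Feature creation: derive a new column from existing ones.
featured = [{**row, "is_large": row["amount"] > 200} for row in deduped]
```

Note that the order matters: deduping before aggregating keeps the repeated West row from being counted twice in the regional totals.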
In the following posts, I’ll show how to accomplish these functions in Data Desk (DD). As you might imagine, nobody does it like DD. And, as you should expect, DD keeps you closer to your data, with better opportunities to detect outliers and other anomalies that you certainly want to know about.