Think about applying natural language processing algorithms such as minimum edit distance to compare the “concept clouds” of various datasets. We would need either to define a set of transformation rules analogous to the usual insert, delete, and substitute operations, or to figure out how to apply those standard operations to a concept cloud directly. This idea may or may not be useful!
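As a rough sketch of the second option, we could reuse the standard insert, delete, and substitute operations unchanged by treating each concept cloud as an ordered list of concept labels (ordered by weight, say). That representation is an assumption, since nothing above fixes what a concept cloud looks like, and the function name, cost parameters, and sample concepts below are hypothetical.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str],
                  ins_cost: int = 1, del_cost: int = 1, sub_cost: int = 1) -> int:
    """Minimum total cost of inserts, deletes and substitutions turning a into b."""
    # prev[j] = cost of turning the first i-1 items of a into the first j items of b.
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        # Turning the first i items of a into an empty list takes i deletions.
        curr = [i * del_cost]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,                         # delete x
                curr[j - 1] + ins_cost,                     # insert y
                prev[j - 1] + (0 if x == y else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical concept clouds, each assumed to be ordered by concept weight.
cloud_a = ["climate", "temperature", "rainfall", "station"]
cloud_b = ["climate", "rainfall", "humidity", "station", "sensor"]

print(edit_distance(cloud_a, cloud_b))  # 3: delete "temperature", insert "humidity" and "sensor"
```

The cost parameters are where concept-specific transformation rules could plug in. For example, substituting a concept with a closely related one might cost less than substituting it with an unrelated one, though that would need a similarity measure this note does not define.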