Hi all,
I am looking for ways of grouping a long list of linguistic and non-linguistic features of words.
For example, three different features, (1) the number of letters in a word, (2) whether the word is covered by a list of the 1,000 most frequently used English words, and (3) the number of senses of a word (polysemy), could be measured together to reveal ‘word difficulty’. Since I have more than 200 features, I need to group them in some way so that I can interpret them as groups rather than individually.
Previous literature has investigated constructs such as ‘difficulty’, ‘complexity’, ‘formality’, and ‘sophistication’, each through several measures. However, these terms were studied in separate papers, and many of them overlap.
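To make the setup concrete, here is a minimal sketch of what I mean, using the three example features above. The word list, the sense counts, the feature names, and the assignment of features to terms are all made-up placeholders for illustration; a real version would use a full frequency list and a lexical database. The point is only that one feature may sit under more than one umbrella term.

```python
# Toy stand-ins (assumptions, not real data): a real study would load a full
# frequency list and look up sense counts in a lexical database.
TOP_1000 = {"the", "run", "house", "water"}   # hypothetical excerpt
SENSE_COUNTS = {"run": 57, "house": 14, "water": 9, "ubiquitous": 1}  # made-up values


def word_features(word):
    """Compute the three example features for a single word."""
    w = word.lower()
    return {
        "n_letters": len(w),              # (1) number of letters
        "in_top_1000": w in TOP_1000,     # (2) covered by the frequency list
        "n_senses": SENSE_COUNTS.get(w, 0),  # (3) number of senses (0 if unknown)
    }


# Many-to-many grouping (hypothetical assignment): a feature can be listed
# under several umbrella terms, which is where the overlap comes from.
FEATURE_GROUPS = {
    "difficulty": ["n_letters", "in_top_1000", "n_senses"],
    "sophistication": ["in_top_1000"],
}


def group_view(word):
    """Return the word's features arranged by umbrella term."""
    feats = word_features(word)
    return {
        term: {name: feats[name] for name in names}
        for term, names in FEATURE_GROUPS.items()
    }


if __name__ == "__main__":
    print(group_view("house"))
```

With more than 200 features, the grouping dictionary is the part I am unsure how to fill in.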
I think I have two options to begin with. First, I can look at each feature and make a list of the terms it could reflect. The problem is that a single feature can reflect several aspects. So the second option, starting with an initial list of terms and placing the features under them, seems more practical. If you know of any resources that offer a single common categorization of features, or have ideas on how I should choose the umbrella terms, please let me know. Thank you so much.