My main task today was getting the word ngrams and character ngrams feature sets to work so that I could finish the ablation experiments. I hadn't been able to get them working before, but today I spotted the problem almost immediately.
The index of the attribute being filtered was left at its default of 3, which meant the only text ever being split into word ngrams and character ngrams was the emotion label. Changing the index to 2 makes the filter operate on the tweet instead.
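To make the bug concrete, here is a minimal Python sketch of the same mistake. It is not the actual filter I used. The row layout (id, tweet, emotion, intensity), the 1-based `text_index` parameter, and the function names are all assumptions made up for illustration.

```python
def word_ngrams(text, n):
    """Return all contiguous runs of n whitespace-separated tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return all contiguous runs of n characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(row, text_index, word_n=2, char_n=3):
    """Extract ngrams from one column of a row.

    text_index is 1-based here (an assumption for this sketch),
    so index 2 means the second column.
    """
    text = row[text_index - 1]
    return {
        "word": word_ngrams(text, word_n),
        "char": char_ngrams(text, char_n),
    }


# Hypothetical row layout: id, tweet, emotion, intensity
row = ["10001", "so happy today", "joy", "0.8"]

wrong = ngram_features(row, text_index=3)  # points at the emotion label
right = ngram_features(row, text_index=2)  # points at the tweet

print(wrong["word"], wrong["char"])
print(right["word"])
```

With the index at 3, the only ngrams produced come from the single word "joy", so the features carry nothing about the tweet itself. With the index at 2, the word bigrams and character trigrams come from the tweet text.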
The final ablation table
As the table shows, the best-performing combination overall was word embeddings with all the lexicons. Adding the word ngrams and character ngrams alongside the other filters was slightly less accurate, possibly due to overfitting.