Finally Weka!

It took a long time to get Weka to work the way I wanted it to, but now it seems simpler than I thought. Before Friday, I needed to run ablation experiments to determine the importance of certain features in the baseline system. Saif Mohammad tested with word ngrams (WN), character ngrams (CN), word embeddings (WE), and lexicons (L). He used 11 lexicons; I will only test 10, as I don’t have access to one of them.

Back in Weka, the way to run models correctly is to keep the preprocessing filter set to “None” and then go to the classification tab to set the correct filters and classifier there. The configuration below describes one of the tests I ran.

weka.filters.MultiFilter -F "weka.filters.unsupervised.attribute.TweetToEmbeddingsFeatureVector -I 2 -B C:/Users/amanj/wekafiles/packages/AffectiveTweets/resources/w2v.twitter.edinburgh.100d.csv.gz -S 0 -K 15 -L -O" -F "weka.filters.unsupervised.attribute.TweetToLexiconFeatureVector -I 2 -A -D -F -H -J -L -N -P -Q -R -T -U -O" -F "weka.filters.unsupervised.attribute.TweetToSentiStrengthFeatureVector -I 2 -U -O" -F "weka.filters.unsupervised.attribute.TweetToSparseFeatureVector -M 0 -I 1 -Q 4 -D 3 -E 5 -L -F -G 0 -I 0" -F "weka.filters.unsupervised.attribute.Reorder -R 5-last,4"

This test used word embeddings and all the lexicons (SentiStrength is a lexicon that has its own filter), together with the LibLINEAR classifier set to L2-regularized, L2-loss. I ran the other tests the same way, either removing all lexicons except the one I was testing or using only the word embeddings filter.
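Since each test only differs in which filters are kept inside the MultiFilter, the configuration string can be assembled from a list of filter specs. The snippet below is a minimal sketch of that idea, not something Weka provides: `build_multifilter` is a hypothetical helper, and it assumes the individual filter specs contain no double quotes of their own (nested quotes would need escaping). The two specs in the example are copied from the configuration above.

```python
def build_multifilter(filter_specs):
    """Join individual filter specs into one MultiFilter option string.

    Hypothetical helper: each spec is wrapped in double quotes and
    passed with -F, mirroring the configuration shown above.
    Assumes no spec contains a double quote.
    """
    for spec in filter_specs:
        if '"' in spec:
            raise ValueError("filter specs with quotes are not handled here")
    parts = ['-F "{}"'.format(spec) for spec in filter_specs]
    return "weka.filters.MultiFilter " + " ".join(parts)


# Filter specs copied from the configuration in this post.
SENTISTRENGTH = (
    "weka.filters.unsupervised.attribute."
    "TweetToSentiStrengthFeatureVector -I 2 -U -O"
)
REORDER = "weka.filters.unsupervised.attribute.Reorder -R 5-last,4"

# A single-lexicon run: keep only SentiStrength plus the Reorder step.
config = build_multifilter([SENTISTRENGTH, REORDER])
print(config)
```

The resulting string can be pasted into Weka in place of the full configuration, so switching between ablation runs means changing one list instead of editing a long line by hand.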

After running the tests, the ablation experiments were mostly done.

[Table image: AblationTable1]
Note: The scores calculated are the Pearson correlation scores.
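For reference, the Pearson correlation between the gold intensity scores and the predicted scores can be computed as in the sketch below. Weka reports this number itself, so this is only an illustration of what the score means. The function name and the example values are made up for illustration and are not results from my experiments.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up gold and predicted values, for illustration only.
gold = [0.1, 0.5, 0.9, 0.3]
predicted = [0.2, 0.4, 0.8, 0.35]
print(pearson(gold, predicted))
```

A score close to 1 means the predictions rise and fall with the gold scores, while a score near 0 means there is no linear relationship.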

Tomorrow I will finish running experiments with word ngrams and character ngrams, and then run those filters together with word embeddings or lexicons. Lastly, I will run the final model, which uses all the filters, to see how it performs with every feature included.
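To keep track of which combinations are still left, the feature groups can be enumerated programmatically. This is a small sketch that lists every non-empty combination of the four groups (WN, CN, WE, L). Running all of them is an assumption on my part rather than the exact set of runs planned, and `feature_subsets` is a hypothetical helper name.

```python
from itertools import combinations

# Feature group abbreviations used in this post.
FEATURE_GROUPS = ["WN", "CN", "WE", "L"]


def feature_subsets(groups):
    """Return every non-empty combination of the groups, smallest first."""
    subsets = []
    for size in range(1, len(groups) + 1):
        for combo in combinations(groups, size):
            subsets.append(combo)
    return subsets


for subset in feature_subsets(FEATURE_GROUPS):
    print(" + ".join(subset))
```

The last combination listed is the full model with all four feature groups.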

I also finished reading the EmoInt paper. It familiarized me with their process and gave me ideas for applications of the regression/deep learning model I will create.

