Data manipulation
To compare the sentiment words with the retrieved tweets accurately, the data first had to be rearranged.
To check each tweet against every sentiment word, I extracted the data as follows:
The sentiment table was split into two array lists, one holding the words and one holding the corresponding polarity of each word. Because both lists are filled in the same order and indexed from zero, a word and its polarity share an index: for example, the word at index 5 of the word list has the polarity at index 5 of the polarity list.
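The paired lists can be sketched in Java roughly as follows. This is only an illustration of the idea, not the project's actual code: the class name `SentimentLexicon`, its methods, and the use of the strings "positive" and "negative" for polarity are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Two parallel lists: words.get(i) has the polarity polarities.get(i). */
final class SentimentLexicon {
    final List<String> words = new ArrayList<>();
    final List<String> polarities = new ArrayList<>();

    /** Appends one row of the sentiment table to both lists so the indexes stay aligned. */
    void add(String word, String polarity) {
        words.add(word);
        polarities.add(polarity);
    }

    /** Returns the polarity stored at the same index as the word, or null if the word is not listed. */
    String polarityOf(String word) {
        int i = words.indexOf(word);
        return i >= 0 ? polarities.get(i) : null;
    }
}
```

The alignment only holds as long as every row is added to both lists together, which is why the sketch exposes a single `add` method rather than letting the lists be filled separately.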
For the Twitter data, I split each tweet's main keys into separate elements of an array, i.e. ID at index 0, Tweet Date at index 1, Tweet Text at index 2, etc. On each iteration (once the current tweet has been compared with every sentiment word), the array is cleared and refilled with the next tweet's data.
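A minimal sketch of this reusable tweet array is shown here. The class name `TweetBuffer` and the constant names are hypothetical, and only the three fields named above are modelled; the real array holds further fields that are not listed in this section.

```java
import java.util.ArrayList;
import java.util.List;

/** Holds the fields of one tweet at a time, at fixed index positions. */
final class TweetBuffer {
    // Index positions as described in the text.
    static final int ID = 0;
    static final int DATE = 1;
    static final int TEXT = 2;

    private final List<String> fields = new ArrayList<>();

    /** Clears the previous tweet and stores the next one at the same indexes. */
    void load(String id, String date, String text) {
        fields.clear();
        fields.add(id);
        fields.add(date);
        fields.add(text);
    }

    /** Returns the field of the current tweet at the given index. */
    String get(int index) {
        return fields.get(index);
    }
}
```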
This process is repeated until every tweet has been checked against the word list and, through the shared index, the polarity list. The sentiment value for each tweet starts at 0. Whenever the tweet contains a sentiment word, that word's polarity is looked up: if it is positive, the sentiment value increases by 1, and if it is negative, it decreases by 1.
Once a tweet has been checked against every sentiment word, its sentiment value is written back to the database, using the tweet ID stored at index 0 of the Twitter data array.
If the sentiment value is greater than 0, the tweet is considered positive; if it is less than 0, negative; and if it is exactly 0, neutral.
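The scoring and labelling steps could look something like the sketch below. It makes several assumptions that are not stated in this section: a "match" is treated as a case-insensitive whole-word match, polarity is stored as the strings "positive" and "negative", and the names `SentimentScorer`, `score` and `label` are invented for the example. Writing the result back to the database is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SentimentScorer {

    /** Adds 1 for each positive sentiment word found in the tweet and subtracts 1 for each negative one. */
    static int score(String tweetText, List<String> words, List<String> polarities) {
        // Assumption: a match means a case-insensitive whole-word match.
        Set<String> tokens = new HashSet<>(
                Arrays.asList(tweetText.toLowerCase(Locale.ROOT).split("[^a-z']+")));

        int sentiment = 0;
        for (int i = 0; i < words.size(); i++) {
            if (tokens.contains(words.get(i).toLowerCase(Locale.ROOT))) {
                // The polarity sits at the same index as the matched word.
                String polarity = polarities.get(i);
                if ("positive".equals(polarity)) {
                    sentiment++;
                } else if ("negative".equals(polarity)) {
                    sentiment--;
                }
            }
        }
        return sentiment;
    }

    /** Maps the final sentiment value to a label. */
    static String label(int sentiment) {
        if (sentiment > 0) {
            return "positive";
        }
        if (sentiment < 0) {
            return "negative";
        }
        return "neutral";
    }
}
```

For example, with the words "love" and "great" marked positive and "hate" marked negative, the text "I love this great phone" scores 2 and is labelled positive, while "love and hate" scores 0 and is labelled neutral.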
Some handwritten notes explaining how the data is split can be seen below: