Calculating a word's sentiment polarity

I have a list of more than 2000 words, each rated by 4 raters on sentiments such as anger (on a scale from 1 to 5). I ran an alpha test in SAS for inter-rater reliability, which gave a coefficient above 0.8, indicating good reliability. My question is: when I conduct a sentiment analysis, would it be appropriate to use the mean of the 4 ratings to define a word's polarity?
This seems more like a theoretical question than a statistical one. I would advise you to look at papers that use similar methods and see what they did.
My gut feeling is that if the ratings are reliable, you can take the mean, but I don't know what the common practice is in your field.
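If it helps to see the two steps side by side, here is a minimal sketch in Python. It assumes the "alpha" in question is Cronbach's alpha (the post does not say which one was used) and uses a small made-up ratings table with one row per word and one column per rater. The function names `cronbach_alpha` and `mean_polarity` are just illustrative, not from any library.

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha, treating each rater (column) as an 'item'.

    ratings: 2-D array-like, rows = words, columns = raters.
    Assumption: this is the alpha meant in the question.
    """
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]                          # number of raters
    rater_vars = ratings.var(axis=0, ddof=1)      # sample variance per rater
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of summed ratings
    return k / (k - 1) * (1 - rater_vars.sum() / total_var)

def mean_polarity(ratings):
    """Per-word score: the mean of that word's ratings across raters."""
    return np.asarray(ratings, dtype=float).mean(axis=1)

# Hypothetical data: 5 words rated by 4 raters on a 1-5 anger scale.
ratings = np.array([
    [5, 4, 5, 5],
    [1, 1, 2, 1],
    [3, 3, 4, 3],
    [2, 1, 2, 2],
    [4, 5, 4, 4],
])

alpha = cronbach_alpha(ratings)
scores = mean_polarity(ratings)

print(f"alpha = {alpha:.3f}")
print("per-word mean ratings:", scores)
```

The mean keeps the score on the original 1 to 5 scale, which makes it easy to interpret. Whether the mean, the median, or something else is the right summary is still a question for the conventions in your field.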

