Update 4
The API Works!
This week I managed to get my API to work and post the data I scraped. I am using the Django Rest-Framework to structure my API. Within the framework I created a table model to easily sort/modify the data I upload into the API. During the implementation stage, I got help from ChatGPT to structure my urls and views class. I also learned more about serialization. I created a serializers class that converts complex data types to Python objects which is then processed as json objects.
For this week, I aim to deploy my API on a server and have it global. I am looking into AWS and Microsoft Azure to see which one would be better in terms of the model I want to represent.
On the other hand, I am also researching what kind of analysis I want to conduct on my data. I have previously suggested that I can make a topic modeling, however thinking more about it I don’t think it will highlight the differences we want to see clearly. Instead of topic modeling, I could do a sentiment analysis/opinion mining. Opinion mining is very similar to sentiment analysis, however it is conducted on specific pre-identified feautures of the model.
The steps of applying opinion mining to the data would be:
- Cleaning the data of stopwords, punctuation, converting to lower case, and tokenizing
- Extracting Features: this would have to be manually done by identifying key terms I would want to look at that are common in the data. I could do a frequency model for the most common words that appear in my data to render this process easier.
- Conducting sentiment analysis. One handicap with this part is that my data is in Turkish. I have found sentiment analysis models trained on Turkish corpus like textblob-tr. This library is not developed by text blob, so the accuracy is not guaranteed. SentiTurkNet is also another library that is developed off of Turkish corpus.
- Opinion aggregation: this step is basically combining the scores of each feature found in the previous step to gain a total score for the review.
- Visualization
I am hoping to achieve the first two steps by next week!