1 minute read

Update Week 3

Formatting the data

Tools used for formatting the data:

  1. pandas DataFrame

To store the data in an efficient way and easily visualize it, I utilized the pandas package for python. However, before storing the data in a dataframe format, I had to convert the dates of the reviews from Turkish to English. Since there weren’t any libraries within python for this conversion, I came up with a hard coded converter that I am hoping to improve.

def conv_date(ad_date):
    new_date = ""
    if ad_date[1] == 'Ocak':
        new_date = "01/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Şubat':
        new_date = "02/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Mart':
        new_date = "03/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Nisan':
        new_date = "04/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Mayıs':
        new_date = "05/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Haziran':
        new_date = "06/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Temmuz':
        new_date = "07/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Ağustos':
        new_date = "08/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Eylül':
        new_date = "09/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Ekim':
        new_date = "10/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Kasım':
        new_date = "11/" + ad_date[0] + "/" + ad_date[2]
    elif ad_date[1] == 'Aralık':
        new_date = "12/" + ad_date[0] + "/" + ad_date[2]
    return new_date

After this conversion, I was able to store the data in the pandas dataframe. An advantage of the pandas package is its easy exportation to other file formats such as csv and excel. Since the reviews that I am storing are at most 300 entries, it doesn’t require me to store it as a csv. Hence, I exported my data to Excel files for easy readability.

Updated: