python - Group data by month and year -




I have a .json file containing a lot of articles, each article formatted like this:

    {
        "source": "....",
        "title": ".......",
        "original_time": "ora: 20:03, 06 dec 2006",
        "datetime": "2006-12-06T20:03:00+00:00",
        "views": 398,
        "comments": 1,
        "content": "...",
        "id": "13"
    }

Now I have to sum the number of views of the articles for each month and year and plot the results, but I don't know how because I'm new to Python. This is what I have done so far:

    import json
    #from pprint import pprint
    import csv
    import time
    import datetime

    views = []
    time = []
    art_timpul = 0
    art_unimedia = 0
    total_articles = 0

    # load the whole list of articles
    json_data = open('all.json')
    data = json.load(json_data)
    #pprint(data)
    json_data.close()

    # count articles per source and collect views/times for unimedia
    for i in data:
        if i["source"] == 'unimedia':
            art_unimedia += 1
            x = i["views"]
            views.append(int(x))
            y = i["original_time"]
            time.append(y)
        if i["source"] == 'timpul':
            art_timpul += 1
        total_articles += 1

    myfile = open('output.csv', 'wb')
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    wr.writerow(views)

    print time
    #print views
    print "articles unimedia", art_unimedia
    print "articles timpul", art_timpul
    print "total articles", total_articles

Edit: I have to group the info by month and year, that is, I have to sum the number of views of the articles written in each month and year, and export them to a file.

It's not entirely clear what the question is, so I'll assume you do not have a problem with reading and writing files, but with parsing the date string and grouping the data.

First, parsing the date. Here you can use e.g. dateutil.parser.parse or time.strptime. dateutil.parser seems to expect the date format that yours has by default, so we'll use that instead of configuring the format for strptime.
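If you would rather stay in the standard library, the strptime route could look roughly like the sketch below. Note that it uses datetime.datetime.strptime rather than time.strptime, and that the format string is my own guess at matching the "datetime" field from the question. It assumes Python 3.7 or later, where %z accepts an offset written with a colon.

```python
from datetime import datetime

# Sample value shaped like the "datetime" field of one article (assumption).
raw = "2006-12-06T20:03:00+00:00"

# Assumed format string; %z accepts "+00:00" on Python 3.7+.
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")

# The year and month are the parts needed for grouping later on.
print(parsed.year, parsed.month)
```

The result is a timezone-aware datetime object, so its .year and .month attributes can be used directly as grouping keys.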

Next, the grouping: the easiest way would be to use a number of dictionaries mapping months or years to views. (You could also use a dictionary for the different sources, instead of the two variables you have now.) Use the month or year as the key to the dictionary, and update the value accordingly. To make life a bit easier, you can use collections.defaultdict, so you don't have to check whether the key already exists.
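As a small illustration of the dictionary-per-source idea, here is a sketch that replaces the two counter variables with a single defaultdict. The sample records and the name articles_by_source are made up for this example; only the "source" key comes from the question.

```python
import collections

# Hypothetical sample records; only the "source" field matters here.
articles = [
    {"source": "unimedia", "views": 398},
    {"source": "timpul", "views": 120},
    {"source": "unimedia", "views": 57},
]

# defaultdict(int) starts every missing key at 0, so no existence check is needed.
articles_by_source = collections.defaultdict(int)
for article in articles:
    articles_by_source[article["source"]] += 1

total_articles = sum(articles_by_source.values())
print(dict(articles_by_source))
print("total articles", total_articles)
```

A new source then needs no extra variable or if branch; it simply shows up as another key.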

Example for grouping by month (similar for year, source etc. in the same loop):

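    import collections, dateutil.parser

    # data is the list of articles loaded from the JSON file
    views_by_month = collections.defaultdict(int)
    for item in data:
        views = item["views"]
        date = dateutil.parser.parse(item["datetime"])
        # note: keyed by month only, so the same month of different years is merged
        views_by_month[date.month] += views
    print(views_by_month)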
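Since the question asks for totals per month and year, and for exporting them, the same loop can be keyed on a (year, month) tuple and the result written out with the csv module. The following is only a sketch under a few assumptions: the sample records are invented, the date is parsed with datetime.strptime instead of dateutil so that it runs with the standard library alone (Python 3.7+), and an in-memory buffer stands in for the output file.

```python
import collections
import csv
import io
from datetime import datetime

# Hypothetical sample records shaped like the articles in the question.
data = [
    {"datetime": "2006-12-06T20:03:00+00:00", "views": 398},
    {"datetime": "2006-12-19T09:15:00+00:00", "views": 102},
    {"datetime": "2007-12-01T11:00:00+00:00", "views": 50},
]

# Key on (year, month) so December 2006 and December 2007 stay separate.
views_by_year_month = collections.defaultdict(int)
for item in data:
    date = datetime.strptime(item["datetime"], "%Y-%m-%dT%H:%M:%S%z")
    views_by_year_month[(date.year, date.month)] += int(item["views"])

# Write one row per (year, month); the buffer stands in for a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["year", "month", "views"])
for (year, month), total in sorted(views_by_year_month.items()):
    writer.writerow([year, month, total])

print(buffer.getvalue())
```

To write to disk instead, the buffer could be swapped for a file opened with open('output.csv', 'w', newline=''). The sorted rows are also in a convenient shape to hand to a plotting library.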
