python - Manipulate Pandas DataFrame containing dictionaries from Twitter API -



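I am working on a script that uses the Twitter API to pull the most recent statuses for a list of users. I am able to retrieve the information through the API, but after converting it to a DataFrame, some columns end up storing dictionaries. I want to spread the keys of those dictionaries across additional columns. Ultimately, I am trying to save this information to a CSV file.

Here is my code: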

    import time

    import numpy as np
    import pandas as pd
    import twython

    app_key = ''
    app_secret = ''
    oauth_token = ''
    oauth_token_secret = ''

    twitter = twython.Twython(app_key, app_secret, oauth_token, oauth_token_secret)

    screen_names = ['@', '@']  # enter the screen names of interest

    tweets = []
    for screen_name in screen_names:
        # up to 200 of the most recent statuses per user
        tweets.extend(twitter.get_user_timeline(screen_name=screen_name, count=200))
        time.sleep(5)  # pause between requests

    df = pd.DataFrame(tweets)

This returns a DataFrame of shape (400, 25). df[[2,3,5]] returns the following:

                           created_at                                           entities  favorite_count
    0  Thu Jun 19 13:14:39 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               0
    1  Thu Jun 19 11:53:51 +0000 2014  {u'symbols': [], u'user_mentions': [{u'id': 18...               0
    2  Thu Jun 19 11:53:25 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               3
    3  Thu Jun 19 11:49:34 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               0
    4  Thu Jun 19 11:01:31 +0000 2014  {u'symbols': [], u'user_mentions': [{u'id': 18...               0

How can I split the entities column across additional columns? For example, I'd like symbols, user_mentions, hashtags, etc. to become additional columns in df.

Any help would be appreciated.

I use a helper function to convert a dict of nested values (likely from an API) into a dict without nested values.

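    def flatten(d):
        # list(...) takes a snapshot of the keys, because d is modified inside the loop
        for key in list(d.keys()):
            if isinstance(d[key], list):
                # each list element becomes its own key: key__0, key__1, ...
                value = d.pop(key)
                for i, v in enumerate(value):
                    d.update(flatten({'%s__%s' % (key, i): v}))
            elif isinstance(d[key], dict):
                # each nested key becomes key__subkey
                value = d.pop(key)
                d.update([('%s__%s' % (key, sub), v) for (sub, v) in flatten(value).items()])
        return d

Here is an example of what it does: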

    In [2]: d = {'user': 'foo', 'data': {'choices': [0,1,2], 'type': 'x1'}}

    In [3]: flatten(d)
    Out[3]:
    {'data__choices__0': 0,
     'data__choices__1': 1,
     'data__choices__2': 2,
     'data__type': 'x1',
     'user': 'foo'}
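One thing the session above does not show is that flatten modifies the dict it is given rather than building a new one. The sketch below repeats the helper so that it runs on its own (with list(d.keys()) so the loop also works on Python 3) and uses copy.deepcopy to keep the nested original. The use of deepcopy is my own suggestion, not part of the original answer.

```python
import copy


def flatten(d):
    """Flatten nested dicts and lists into one level. Modifies d in place."""
    for key in list(d.keys()):  # snapshot of the keys, since d changes in the loop
        if isinstance(d[key], list):
            value = d.pop(key)
            for i, v in enumerate(value):
                d.update(flatten({'%s__%s' % (key, i): v}))
        elif isinstance(d[key], dict):
            value = d.pop(key)
            d.update([('%s__%s' % (key, sub), v)
                      for (sub, v) in flatten(value).items()])
    return d


d = {'user': 'foo', 'data': {'choices': [0, 1, 2], 'type': 'x1'}}

# Flatten a deep copy so that d keeps its nested structure.
flat = flatten(copy.deepcopy(d))
print(flat)
print(d)  # still nested

# Without the copy, the returned object is the very dict that was passed in.
result = flatten(d)
print(result is d)  # True: d itself has been flattened
```

A shallow copy would not be enough here, because the helper also pops keys out of the nested dicts while it recurses.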

In your example, you need to do:

    df = pd.DataFrame([flatten(t) for t in tweets])
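Since the end goal in the question is a CSV file, here is a minimal sketch of the whole path from nested records to CSV text. The sample_tweets list is made up for illustration and only mimics the general shape of a timeline response. The flatten helper is repeated so the block runs on its own, again with list(d.keys()) for Python 3.

```python
import pandas as pd


def flatten(d):
    """Flatten nested dicts and lists into one level. Modifies d in place."""
    for key in list(d.keys()):  # snapshot of the keys, since d changes in the loop
        if isinstance(d[key], list):
            value = d.pop(key)
            for i, v in enumerate(value):
                d.update(flatten({'%s__%s' % (key, i): v}))
        elif isinstance(d[key], dict):
            value = d.pop(key)
            d.update([('%s__%s' % (key, sub), v)
                      for (sub, v) in flatten(value).items()])
    return d


# Hypothetical records, shaped like a heavily trimmed API response.
sample_tweets = [
    {'favorite_count': 0,
     'entities': {'symbols': [], 'hashtags': [{'text': 'pandas'}]}},
    {'favorite_count': 3,
     'entities': {'symbols': [], 'hashtags': []}},
]

# One flat dict per tweet; keys missing from a record become NaN in its row.
df = pd.DataFrame([flatten(t) for t in sample_tweets])

# With no path argument, to_csv returns the CSV as a string;
# pass a filename instead to write the file to disk.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Note that empty lists vanish during flattening, so a tweet without hashtags simply has no hashtag columns of its own, and the number of columns in df depends on the longest list seen across all tweets.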

python twitter pandas twython
