python - Manipulate Pandas DataFrame containing dictionaries from Twitter API

I am working on a script that uses the Twitter API to pull the most recent statuses for a list of users. I am able to retrieve the info using the API, but upon converting it to a DataFrame, some columns end up storing dictionaries. I want to spread the keys of those dictionaries into additional columns. Ultimately, I am trying to save this info to a CSV.

Here is my code:

    import twython
    import time
    import pandas as pd
    import numpy as np

    app_key = ''
    app_secret = ''
    oauth_token = ''
    oauth_token_secret = ''

    twitter = twython.Twython(app_key, app_secret, oauth_token, oauth_token_secret)

    screen_names = ['@', '@']  # enter the screen names of interest

    tweets = []

    for screen_name in screen_names:
        # Pull up to 200 recent statuses per user, pausing between requests
        tweets.extend(twitter.get_user_timeline(screen_name=screen_name, count=200))
        time.sleep(5)

    df = pd.DataFrame(tweets)

This returns a DataFrame of shape (400, 25). df[[2,3,5]] returns the following:

                           created_at                                           entities  favorite_count
    0  Thu Jun 19 13:14:39 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               0
    1  Thu Jun 19 11:53:51 +0000 2014  {u'symbols': [], u'user_mentions': [{u'id': 18...               0
    2  Thu Jun 19 11:53:25 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               3
    3  Thu Jun 19 11:49:34 +0000 2014  {u'symbols': [], u'user_mentions': [], u'hasht...               0
    4  Thu Jun 19 11:01:31 +0000 2014  {u'symbols': [], u'user_mentions': [{u'id': 18...               0

How can I split the entities column across additional columns? For example, I'd like symbols, user_mentions, hashtags, etc. to become additional columns in df.

Any help would be appreciated.

I use a helper function to convert a dict with nested values (typically from an API) into a dict without nested values.

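    def flatten(d):
        # Iterate over a snapshot of the keys, because d is modified in the loop
        for key in list(d.keys()):
            if isinstance(d[key], list):
                # Replace a list with one key per element: key__0, key__1, ...
                value = d.pop(key)
                for i, v in enumerate(value):
                    d.update(flatten({'%s__%s' % (key, i): v}))
            elif isinstance(d[key], dict):
                # Replace a nested dict with one key per sub-key: key__sub
                value = d.pop(key)
                d.update([('%s__%s' % (key, sub), v)
                          for (sub, v) in flatten(value).items()])
        return d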

Here is an example of what it does:

    In [2]: d = {'user': 'foo', 'data': {'choices': [0,1,2], 'type': 'x1'}}

    In [3]: flatten(d)
    Out[3]:
    {'data__choices__0': 0,
     'data__choices__1': 1,
     'data__choices__2': 2,
     'data__type': 'x1',
     'user': 'foo'}
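Note that flatten pops keys from the dict it is given, so the input is modified in place. If the original tweets are still needed afterwards, a non-mutating variant along the following lines may help. This is only a sketch: the name flatten_copy and the sep parameter are made up for illustration and are not part of the helper above.

```python
def flatten_copy(d, sep='__'):
    """Return a new flat dict and leave the input dict untouched."""
    flat = {}
    for key, value in d.items():
        if isinstance(value, list):
            # Treat a list like a dict keyed by element position.
            value = dict(enumerate(value))
        if isinstance(value, dict):
            # Prefix every flattened sub-key with the parent key.
            for sub, v in flatten_copy(value, sep).items():
                flat['%s%s%s' % (key, sep, sub)] = v
        else:
            flat[key] = value
    return flat


d = {'user': 'foo', 'data': {'choices': [0, 1, 2], 'type': 'x1'}}
flat = flatten_copy(d)
print(flat)
print(d)  # still nested
```

Like the helper above, this drops empty lists and empty dicts entirely, because they contribute no leaf values.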

In your example, you would need to do:

    df = pd.DataFrame([flatten(t) for t in tweets])
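To see how this plays out through to the CSV step, here is a small sketch that runs without Twitter credentials. The two tweets are invented stand-ins containing only a few fields, not real API output, and the CSV is written to an in-memory buffer rather than to a file.

```python
import io

import pandas as pd


def flatten(d):
    # Same helper as above; iterates over a snapshot of the keys.
    for key in list(d.keys()):
        if isinstance(d[key], list):
            value = d.pop(key)
            for i, v in enumerate(value):
                d.update(flatten({'%s__%s' % (key, i): v}))
        elif isinstance(d[key], dict):
            value = d.pop(key)
            d.update([('%s__%s' % (key, sub), v)
                      for (sub, v) in flatten(value).items()])
    return d


# Hypothetical sample data standing in for the API response.
tweets = [
    {'id': 1, 'favorite_count': 0,
     'entities': {'symbols': [], 'hashtags': [{'text': 'pandas'}]}},
    {'id': 2, 'favorite_count': 3,
     'entities': {'symbols': [], 'hashtags': []}},
]

df = pd.DataFrame([flatten(t) for t in tweets])
print(sorted(df.columns))

# Write the CSV to a buffer; pass a file path instead to save to disk.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

Because each tweet can flatten to a different set of keys, the DataFrame's columns are the union of all keys, and a tweet that lacks a given key gets NaN in that column (an empty field in the CSV).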

