python - pandas timeseries resampling ending a given day -




I guess many people working with time series have come across this issue, and pandas doesn't seem to provide a straightforward solution (yet!).

Suppose you have a time series of daily data, say close prices, indexed by date (day). Today is 19 Jun, so the last close is 18 Jun. I want to resample to OHLC bars at a given frequency (let's say M or 2M), with buckets ending on 18 Jun. For M frequency, the last bucket would be 19 May-18 Jun, the previous one 19 Apr-18 May, and so on.

ts.resample('M', how='ohlc')

will do the resampling, but 'M' is an 'end_of_month' period, so the result gives a full month for 2014-05 and a two-week period for 2014-06, and the last bar won't be a 'monthly bar'. That's not what I want. On 2M frequency, given my sample time series, my test gives me a last bucket labelled 2014-07-31 (and the previous one labelled 2014-05-31), which is quite misleading since there is no data for July. The supposed last two-month bucket is once again only a two-week one.
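The month-end behaviour described above can be reproduced with a small sketch. It assumes a recent pandas, where the `how=` argument is gone and `.ohlc()` is called on the resampler, and it uses a synthetic 200-day series (the data in `ts` is made up here, not the original prices). Passing the `MonthEnd` offset object instead of the `'M'` string is a choice made here to sidestep alias renames between pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic daily "close" series ending on 2014-06-18 (illustrative data only).
ts = pd.Series(
    np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
)

# Month-end buckets: each bar is labelled with the last calendar day of its month.
monthly = ts.resample(pd.offsets.MonthEnd()).ohlc()

# Number of daily observations that fall into each month-end bucket.
days_per_bar = ts.resample(pd.offsets.MonthEnd()).count()

print(monthly.tail(2))
print(days_per_bar.tail(2))  # the final bar is labelled 2014-06-30 but holds only 18 days
```

The last bar carries a month-end label even though the data stops on the 18th, which is the mismatch the question is about.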

The right DatetimeIndex is easily created:

pandas.date_range(end='2014-06-18', freq='2M', periods=300) + datetime.timedelta(days=18)

(Note that the pandas documentation prefers doing the same thing with

pandas.date_range(end='2014-06-18', freq='2M', periods=300) + pandas.tseries.offsets.DateOffset(days=18)

but my tests show that this method, though more 'pandaïc', is 2x slower.)
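As a small check of the trick above: any month-end date plus 18 days lands on the 18th of the following month, which is why the shifted range ends up anchored on the 18th. The sketch below uses the `MonthEnd(2)` offset object in place of the `'2M'` string and `periods=6` instead of 300 to keep the output short. Both are choices made here, and the names `month_ends`, `edges_td` and `edges_off` are illustrative.

```python
import datetime

import pandas as pd

# Two-month-end dates at or before 2014-06-18 (the last one is 2014-05-31).
month_ends = pd.date_range(end="2014-06-18", freq=pd.offsets.MonthEnd(2), periods=6)

# Shift by 18 days, once with a plain timedelta and once with a DateOffset.
edges_td = month_ends + datetime.timedelta(days=18)
edges_off = month_ends + pd.DateOffset(days=18)

print(edges_td)
```

Both forms should produce the same dates. The shortcut only holds for anchor days that exist in every month, so it is safe up to the 28th.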

But so far I can't apply the right DatetimeIndex to ts.resample().

It seems the pandas dev team (Date ranges in Pandas) is aware of the issue, but in the meantime, how would you get OHLC bars at a rolling frequency anchored on the last day of the time series?

Anyway, thanks to the pandas/numpy dev teams for these amazing tools.

This is hacked together from copy/paste, and I'm sure it fails on some cases, but below is some starting code for a custom offset anchored to a particular day in the month.

# Note: as_datetime, as_timestamp and apply_nat are private helpers from the
# pandas version this was written against; they may not exist in later releases.
from pandas.tseries.offsets import (as_datetime, as_timestamp, apply_nat,
                                    DateOffset, relativedelta, datetime)


class MonthAnchor(DateOffset):
    """DateOffset anchored to a day in the month

    Arguments:
        day_anchor: day to be anchored to
    """

    def __init__(self, n=1, **kwds):
        super(MonthAnchor, self).__init__(n)

        self.kwds = kwds
        self._dayanchor = self.kwds['day_anchor']

    @apply_nat
    def apply(self, other):
        n = self.n

        if other.day > self._dayanchor and n <= 0:  # roll forward if n <= 0
            n += 1
        elif other.day < self._dayanchor and n > 0:
            n -= 1

        # Move by whole months, then snap to the anchor day.
        other = as_datetime(other) + relativedelta(months=n)
        other = datetime(other.year, other.month, self._dayanchor)
        return as_timestamp(other)

    def onOffset(self, dt):
        return dt.day == self._dayanchor

    _prefix = ''

Example usage:

In [28]: df = pd.DataFrame(data=np.linspace(50, 100, 200), index=pd.date_range(end='2014-06-18', periods=200), columns=['value'])

In [29]: df.head()
Out[29]:
                value
2013-12-01  50.000000
2013-12-02  50.251256
2013-12-03  50.502513
2013-12-04  50.753769
2013-12-05  51.005025

In [61]: month_offset = MonthAnchor(day_anchor = df.index[-1].day + 1)

In [62]: df.resample(month_offset, how='ohlc')
Out[62]:
                value
                 open        high        low       close
2013-11-19  50.000000   54.271357  50.000000   54.271357
2013-12-19  54.522613   62.060302  54.522613   62.060302
2014-01-19  62.311558   69.849246  62.311558   69.849246
2014-02-19  70.100503   76.884422  70.100503   76.884422
2014-03-19  77.135678   84.673367  77.135678   84.673367
2014-04-19  84.924623   92.211055  84.924623   92.211055
2014-05-19  92.462312  100.000000  92.462312  100.000000
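As an alternative sketch that avoids subclassing pandas internals, the anchored dates can be used directly as bucket labels with a plain `groupby`. This is not the method used above and not a pandas resampling feature: it swaps in `numpy.searchsorted` to map each day to the first anchored date on or after it. The names `edges`, `positions`, `labels` and `bars` are made up here, the data is synthetic, and the 18-day shift assumes an anchor day no later than the 28th.

```python
import numpy as np
import pandas as pd

# Synthetic daily series ending on 2014-06-18 (illustrative data only).
ts = pd.Series(
    np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
)

# Bucket end dates: the 18th of each month, built as month-end + 18 days.
edges = pd.date_range(
    end="2014-06-18", freq=pd.offsets.MonthEnd(), periods=12
) + pd.Timedelta(days=18)

# For every day, find the first bucket end that is on or after it.
positions = np.searchsorted(edges.values, ts.index.values, side="left")
labels = edges[positions]

# Aggregate each bucket into open/high/low/close, labelled by its end date.
bars = ts.groupby(labels).agg(open="first", high="max", low="min", close="last")
print(bars)
```

Here the bars are labelled by their closing date (2014-06-18 for the last one) rather than their opening date, and `edges` must reach back far enough to cover the first observation. For two-month buckets, `pd.offsets.MonthEnd(2)` could replace `MonthEnd()` under the same assumptions.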

python pandas time-series
