Getting number of rows larger than MAX_INT from cursor in Python's psycopg2 with Amazon Redshift -
Getting number of rows larger than MAX_INT from cursor in Python's psycopg2 with Amazon Redshift -
i have started using python module psycopg2 work redshift database.
i have query inserts lot of rows (about 100 billions), , results of cursor not match:
cursor.execute("insert ...") status_msg = cursor.statusmessage row_count = cursor.rowcount logging.info("status_message='%s', row_count=%d" % (status_msg, row_count)) gives me:
>>> status_message='insert 0 100791203475', row_count=2006955667 for other big queries returns negative number. i'm pretty sure because number of rows more can fit in int, haven't found way prepare it.
i tried experimenting bit python's ints doesn't formatting here issue , more beingness returned psycopg2:
>>> print "%d" % int(100791203475) 100791203475 is there way create sure psycopg2 homecoming right number of rows? bug in psycopg2 module i'm starting think, or doing wrong end result?
i don't think it's bug in psycopg2 per se, much limitation way rowcount handled internally. psycopg2 implemented in c using python extensions, , type uses rowcount (signed) long.
how big long highly dependent on environment in compiled. based on results, suspect compiled in 32-bit environment, , hence long -2^31+1 2^31-1.
here code on github rowcount defined: https://github.com/psycopg/psycopg2/blob/56adc590fffbc76fa5e99aa64c657621a630cbe1/psycopg/cursor_type.c
if inclined may able compile in 64-bit environment long larger. see link installing source: http://initd.org/psycopg/docs/install.html#install-from-source
i'm not sure how reliable custom-compiled prove be, though.
my recommendation utilize rowcount informational purposes, not things need ironclad accuracy. in add-on overflow issues have been experiencing case, according doc, there other cases -1 returned in non-failure cases. (http://initd.org/psycopg/docs/cursor.html#cursor.rowcount)
instead, determine if operation failed (i.e. malformed query or fk constraint violation, or similar), trap exceptions. if there no exceptions, operation didn't fail (although have updated/inserted/deleted 0 rows).
python psycopg2 amazon-redshift integer-overflow
Comments
Post a Comment