urllib2 - Python - HTTPRedirectHandler doesn't always get redirects
I am using urllib2's HTTPRedirectHandler to get the redirects of a URL. The code looks like this:

import urllib2, cookielib

class HTTPRedirectHandler(urllib2.HTTPRedirectHandler):
    # Record every URL that urllib2 is redirected to.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        newreq = urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)
        if newreq is not None:
            self.redirections.append(newreq.get_full_url())
        return newreq

def getlistofredirecturls(adurl):
    urllist = []
    h = HTTPRedirectHandler()
    h.max_redirections = 100
    h.redirections = [adurl]  # start the chain with the original URL
    opener = urllib2.build_opener(h)
    response = opener.open(adurl)
    for redirect in h.redirections:
        urllist.append(redirect)
    return urllist

This works great for the majority of URLs. However, from time to time, it gives me only the first URL and not the final page (or anything in between). For example, this advertisement link:
'http://trc.taboola.com/wtffunfact/log/3/click?pi=/&ri=4c376c94a4413aaace150a3cd2b9d902&sd=v2_b935da3ff37d8b8efc9f1aeaf46df13b_3932b8b5d26526ed68d78d4ca0b1c52d_1403183649_1403183649_cn6aaq&ui=3932b8b5d26526ed68d78d4ca0b1c52d&it=video&ii=~~v1~~-2280573428682392386~~uno41ain9fs4v6klu4hrenh33hc7jimtl3zkewlhbjlt20wtoflyybfp19r7bzdgodxni1tdh9vjwwb-9gbnka&pt=home&li=rbox-h2v&redir=http://livingwelltrends.com/3/?src=tabn&query=ad3&net=desktop&ad=food7&utm_source=taboola&utm_medium=referral&utm_content=wtffunfact&p=jmedia-sc&vi=1403183649792&r=2'
just returns the same URL, but when I paste the URL into a browser, it sends me to another page. How can I get the final page programmatically? I tried using urlopen:

var = urllib2.urlopen(url)
print var.geturl()

but that doesn't give the final landing page either. The redirect handler works for most URLs, so I don't know what the issue is with the ones where it doesn't. Does anyone have an idea?
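
One thing that may be worth ruling out (an assumption, since the response body for this link is not shown here): urllib2 only follows redirects sent as HTTP 3xx responses, so a page that redirects through an HTML meta-refresh tag or through JavaScript would look like a final page to the handler. The sketch below checks a fetched page for a meta-refresh target. The helper name find_meta_refresh_url and the regular expressions are hypothetical, not part of urllib2, and a regex is only a rough stand-in for a real HTML parser.

```python
import re

# Hypothetical helper, not part of urllib2: finds a tag such as
# <meta http-equiv="refresh" content="0; url=http://..."> in a page.
META_REFRESH_RE = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*>',
    re.IGNORECASE)
URL_IN_CONTENT_RE = re.compile(
    r'content\s*=\s*["\']\s*\d+\s*;\s*url\s*=\s*["\']?([^"\'>]+)',
    re.IGNORECASE)

def find_meta_refresh_url(html):
    """Return the meta-refresh target URL in html, or None if absent."""
    tag = META_REFRESH_RE.search(html)
    if tag is None:
        return None
    target = URL_IN_CONTENT_RE.search(tag.group(0))
    if target is None:
        return None
    return target.group(1).strip()
```

If the body returned by opener.open(adurl).read() yields a URL here, that URL could be passed back to getlistofredirecturls to continue the chain. A redirect done in JavaScript would not be caught this way.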
python urllib2