python - Getting size of pdf without downloading
Is it possible to know the size of a PDF, e.g. http://example.com/abc.pdf, using the requests module in Python without downloading it? I am writing an application where, if the net speed is slow and the PDF is big, the download should be postponed to a later time.
Use an HTTP HEAD request
The response provides, in its headers, more details about the file to download without fetching the whole file.
>>> import requests
>>> url = "http://www.pdf995.com/samples/pdf.pdf"
>>> req = requests.head(url)
>>> req.content
''
>>> req.headers["content-length"]
'433994'
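The HEAD approach above can be wrapped into a small helper. This is only a sketch: the function names, the timeout, and the policy of treating an unknown size as "do not download now" are assumptions for illustration, not part of the original answer. Note that a server is free to omit Content-Length, so the result may be None.

```python
def content_length(headers):
    """Return the Content-Length header as an int, or None if missing or invalid."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def remote_size(url, timeout=10):
    """Ask the server for the headers only (HEAD) and return the declared size."""
    # Imported here so the pure helpers stay usable without requests installed.
    import requests

    # requests.head() does not follow redirects by default, so enable it explicitly.
    resp = requests.head(url, allow_redirects=True, timeout=timeout)
    resp.raise_for_status()
    # resp.headers is case-insensitive, so "Content-Length" matches any casing.
    return content_length(resp.headers)


def should_download_now(size, max_bytes):
    """Hypothetical policy: postpone when the size is unknown or above the limit."""
    return size is not None and size <= max_bytes
```

For example, `should_download_now(remote_size(url), 1000000)` would decide whether a file of at most about 1 MB gets fetched right away.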
Or use a streaming read
>>> req = requests.get(url, stream=True)
>>> res = req.iter_content(30)
>>> res
<generator object generate at 0x7f9ad3270320>
>>> next(res)
'%PDF-1.3\n%\xc7\xec\x8f\xa2\n30 0 obj\n<</Len'
>>> next(res)
'gth 31 0 R/Filter /FlateDecode'
>>> next(res)
'>>\nstream\nx\x9c\xed}\xdd\x93%\xb7m\xef\xfb\xfc\x15s\xf7%nu\xf6\xb8'
If you are able to work out the PDF size from the initial bytes of the file, you can decide whether to go on or not.
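A related sketch of the streaming idea: with stream=True, requests fetches only the headers until the body is read, so the declared size can be checked first and the bytes counted while reading as a safety net. The names copy_with_limit and download_if_small, the chunk size, and the limit are assumptions for illustration.

```python
import io


def copy_with_limit(chunks, out, max_bytes):
    """Write chunks to out; return False as soon as max_bytes would be exceeded."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            return False
        out.write(chunk)
    return True


def download_if_small(url, path, max_bytes, timeout=10):
    """Download url to path unless it is (or turns out to be) larger than max_bytes."""
    # Imported here so copy_with_limit stays usable without requests installed.
    import requests

    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()
        declared = resp.headers.get("Content-Length")
        if declared is not None and declared.isdigit() and int(declared) > max_bytes:
            return False  # the body has not been read at this point
        # If the limit is hit mid-download, a partial file is left at path.
        with open(path, "wb") as out:
            return copy_with_limit(resp.iter_content(8192), out, max_bytes)


# Offline demonstration of the counting helper with in-memory data.
buf = io.BytesIO()
ok = copy_with_limit([b"%PDF", b"-1.3"], buf, max_bytes=16)
```

Counting while reading also covers servers that send no Content-Length at all, at the cost of having downloaded up to the limit before giving up.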
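Use the Range request header
HTTP allows asking for retrieval of just a range of bytes.
If the server supports that, you can use a trick: ask for a range of bytes that exists only in files that are too big. If you get some bytes back (and the status is OK, i.e. 206 Partial Content), you know the file is too large.
If you get the exception ChunkedEncodingError: IncompleteRead(0 bytes read), you know the file is smaller. A server may also answer such a request with status 416 (Range Not Satisfiable).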
Call it like this:
>>> headers = {"Range": "bytes=999500-999600"}
>>> req = requests.get(url, headers=headers)
This works only if the server allows serving partial content.
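A variant of the Range idea, different from the probe above: request only the first byte and read the total size from the Content-Range header of the 206 response, which has the form "bytes 0-0/433994". This is a sketch under the assumption that the server honours Range requests; the function names are made up for illustration.

```python
import re


def total_from_content_range(value):
    """Parse 'bytes 0-0/433994' into 433994; None if the total is '*' or malformed."""
    match = re.fullmatch(r"bytes (?:\d+-\d+|\*)/(\d+|\*)", value.strip())
    if not match or match.group(1) == "*":
        return None
    return int(match.group(1))


def size_via_range(url, timeout=10):
    """Request a single byte and return the total size the server reports, if any."""
    # Imported here so the parser stays usable without requests installed.
    import requests

    resp = requests.get(
        url, headers={"Range": "bytes=0-0"}, stream=True, timeout=timeout
    )
    try:
        if resp.status_code != 206:
            return None  # the server ignored or rejected the Range header
        return total_from_content_range(resp.headers.get("Content-Range", ""))
    finally:
        resp.close()
```

stream=True keeps the body from being read when a server ignores the Range header and starts sending the whole file.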
python http-headers request