python - Storing the unknown id of an HTML tag
I'm trying to scrape HTML using BeautifulSoup and I'm having problems finding a tag's id using Python 3.4. I know where the tag ("tr") is, but its id keeps changing, and I want to save the id whenever it changes. Example:
<div class="thisclass">
  <table id="thistable">
    <tbody>
      <tr id="what I want">
        <td class="someinfo"></td>
      </tr>
    </tbody>
  </table>
</div>
I can find the div tag and the table, and I know the tr tag is there, but I want to extract the text next to id, without knowing what that text is going to say.
So far I have this code:
    soup = BeautifulSoup(url.read())
    divtag = soup.find_all("table", id="thistable")
    i = 0
    for i in divtag:
        trtag = soup.find("tr", id)
        print(trtag)
        i = i + 1
If anyone can help me solve this problem, I would appreciate it.
You can use a CSS selector:
    print([element.get('id') for element in soup.select('table#thistable tr[id]')])
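To see the selector working end to end, here is a minimal self-contained sketch. The sample markup, the id value `row-20150314`, and the variable names are made up for illustration; only the `table#thistable tr[id]` selector comes from the answer above. It assumes the `bs4` package is installed and uses the built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the question; the id value is invented.
html = """
<div class="thisclass">
  <table id="thistable">
    <tbody>
      <tr id="row-20150314">
        <td class="someinfo">some text</td>
      </tr>
      <tr>
        <td>row without an id</td>
      </tr>
    </tbody>
  </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# 'tr[id]' matches any <tr> that carries an id attribute, whatever its value.
rows = soup.select("table#thistable tr[id]")
row_ids = [row.get("id") for row in rows]
print(row_ids)

# Alternative without CSS: id=True means "has an id attribute".
table = soup.find("table", id="thistable")
first_row = table.find("tr", id=True)
print(first_row["id"], first_row.get_text(strip=True))
```

The attribute selector `[id]` tests only for the presence of the attribute, so the current id never has to be known in advance. Storing `row_ids` between runs and comparing it with the previous value is one way to notice when the id changes.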
python html tags web-scraping beautifulsoup