python - Using Requests and lxml, get href values for rows in a table


Python 3

I am having a hard time iterating through the rows of a table.

How do I iterate the tr[1] component over the number of rows in the table body for the teamname, teamstate, and teamlink XPaths?

import lxml.html
from lxml.etree import XPath

url = "http://www.maxpreps.com/rankings/basketball-winter-15-16/7/national.htm"

# Compiled XPath expressions; note that tr[1] is hardcoded to the first row
rows_xpath = XPath('//*[@id="rankings"]/tbody')
teamname_xpath = XPath('//*[@id="rankings"]/tbody/tr[1]/th/a/text()')
teamstate_xpath = XPath('//*[@id="rankings"]/tbody/tr[1]/td[2]/text()')
teamlink_xpath = XPath('//*[@id="rankings"]/tbody/tr[1]/th/a/@href')

html = lxml.html.parse(url)

for row in rows_xpath(html):
    teamname = teamname_xpath(row)
    teamstate = teamstate_xpath(row)
    teamlink = teamlink_xpath(row)
    print(teamname, teamlink)

I have also attempted it with the following:

from lxml import html
import requests

siteitem = ['http://www.maxpreps.com/rankings/basketball-winter-15-16/7/national.htm'
            ]

def linkscrape():
    page = requests.get(target)
    tree = html.fromstring(page.content)

    # get team link
    for link in tree.xpath('//*[@id="rankings"]/tbody/tr[1]/th/a/@href'):
        print(link)
    # get team name
    for name in tree.xpath('//*[@id="rankings"]/tbody/tr[1]/th/a/text()'):
        print(name)
    # get team state
    for state in tree.xpath('//*[@id="rankings"]/tbody/tr[1]/td[2]/text()'):
        print(state)

for target in siteitem:
    linkscrape()

Thanks for looking :D

If I understand what you're asking, you want to iterate over the rows in the rankings table. So, start with a loop over the rows:

import lxml.html
doc = lxml.html.parse('http://www.maxpreps.com/rankings/basketball-winter-15-16/7/national.htm')

for row in doc.xpath('//table[@id="rankings"]/tbody/tr'):

This will iterate over each row in the document. Now, for each row, extract the data you want using XPath expressions relative to that row:

    team_link = row.xpath('th/a/@href')[0]
    team_name = row.xpath('th/a/text()')[0]
    team_state = row.xpath('td[contains(@class, "state")]/text()')[0]
    print(team_state, team_name, team_link)

Which on my system yields output along the lines of:

CA Manteca /high-schools/manteca-buffaloes-(manteca,ca)/basketball-winter-15-16/rankings.htm
MD Mount St. Joseph (Baltimore) /high-schools/mount-st-joseph-gaels-(baltimore,md)/basketball-winter-15-16/rankings.htm
TX Brandeis (San Antonio) /high-schools/brandeis-broncos-(san-antonio,tx)/basketball-winter-15-16/rankings.htm
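
For anyone who wants to try the row-by-row approach without hitting the live site, here is a minimal, self-contained sketch. The sample markup, the team names, and the `extract_teams` helper are all made up for illustration; the real page's structure is only assumed to resemble it. The key point is that the per-row expressions have no leading slash, so they are evaluated relative to each `tr` rather than to the whole document.

```python
import lxml.html

# Hypothetical sample markup, shaped like the rankings table discussed above.
SAMPLE_HTML = """
<html><body>
<table id="rankings">
  <tbody>
    <tr>
      <th><a href="/high-schools/team-a/rankings.htm">Team A</a></th>
      <td class="rank">1</td>
      <td class="state">CA</td>
    </tr>
    <tr>
      <th><a href="/high-schools/team-b/rankings.htm">Team B</a></th>
      <td class="rank">2</td>
      <td class="state">MD</td>
    </tr>
  </tbody>
</table>
</body></html>
"""


def extract_teams(markup):
    """Return a (state, name, link) tuple for every row of the rankings table."""
    doc = lxml.html.fromstring(markup)
    teams = []
    for row in doc.xpath('//table[@id="rankings"]/tbody/tr'):
        # No leading slash: each expression is evaluated relative to this row.
        link = row.xpath('th/a/@href')[0]
        name = row.xpath('th/a/text()')[0]
        state = row.xpath('td[contains(@class, "state")]/text()')[0]
        teams.append((state.strip(), name.strip(), link))
    return teams


teams = extract_teams(SAMPLE_HTML)
for team in teams:
    print(*team)
```

The same helper should also work with the requests-based attempt from the question, since `lxml.html.fromstring` accepts the bytes in `page.content`. Note that indexing with `[0]` raises an `IndexError` for any row that lacks the expected cell, so a real scraper may want to check for empty results first.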
