python - Efficient way to extract data within double quotes
I need to extract the data within double quotes from a string.
input:
<a href="networking-denial-of-service.aspx">next page →</a>
output:
networking-denial-of-service.aspx
Currently, I am using the following method, and it runs fine.
atag = '<a href="networking-denial-of-service.aspx">next page →</a>'
start = 0
end = 0
# remember the index of the first and second double quote
for i in range(len(atag)):
    if atag[i] == '"' and start == 0:
        start = i
    elif atag[i] == '"' and end == 0:
        end = i
# slice out everything between the two quotes
nxtlink = atag[start+1:end]
So, my question is: is there a more efficient way to do this task?
Thank you.
You tagged this beautifulsoup, so I don't see why you would want a regex. If you just want the href from the anchors, you can use a CSS select with 'a[href]', which will
only find anchor tags that have href attributes:
from bs4 import BeautifulSoup

h = '''<a href="networking-denial-of-service.aspx">next page →</a>'''
soup = BeautifulSoup(h, "html.parser")
print(soup.select_one('a[href]')["href"])
Or with find:
print(soup.find('a', href=True)["href"])
If you have multiple:
for a in soup.select('a[href]'):
    print(a["href"])
Or:
for a in soup.find_all("a", href=True):
    print(a["href"])
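If BeautifulSoup is not installed, the "every anchor that has an href" loop can also be sketched with the standard library's html.parser. This is a swapped-in technique, not what the answer above uses; the names HrefCollector and extract_hrefs and the second link (index.aspx) are made up for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the surrounding
        # quotes have already been stripped by the parser
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html):
    """Return the href values of all anchors in html, in document order."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical sample: two anchors with an href, one without
sample = (
    '<a href="networking-denial-of-service.aspx">next page →</a>'
    '<a name="top">no link here</a>'
    '<a href="index.aspx">home</a>'
)
print(extract_hrefs(sample))
```

The anchor without an href is skipped, which mirrors what 'a[href]' and href=True do in the BeautifulSoup versions.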
You can also specify that you want hrefs that have a leading ":
soup.select_one('a[href^="]')
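For the narrow case in the question (one string, one quoted value, no real HTML parsing needed), plain string methods are probably enough. A minimal sketch, assuming you only want the text between the first pair of double quotes; first_quoted is a hypothetical helper name, not something from the question or from any library.

```python
def first_quoted(text):
    """Return the text between the first pair of double quotes, or None."""
    start = text.find('"')            # index of the opening quote, -1 if absent
    if start == -1:
        return None
    end = text.find('"', start + 1)   # index of the closing quote
    if end == -1:
        return None
    return text[start + 1:end]


atag = '<a href="networking-denial-of-service.aspx">next page →</a>'
print(first_quoted(atag))  # networking-denial-of-service.aspx
```

str.find does the scanning in C rather than in a Python-level loop, and unlike the start == 0 check in the question it also copes with a quote at index 0. It knows nothing about HTML, though, so for anything beyond a single known tag the BeautifulSoup approach is the safer one.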