python - Efficient way to extract data within double quotes


I need to extract the data within double quotes from a string.

input:

<a href="networking-denial-of-service.aspx">next page →</a> 

output:

networking-denial-of-service.aspx 

Currently, I am using the following method, and it runs fine.

    atag = '<a href="networking-denial-of-service.aspx">next page →</a>'
    start = 0
    end = 0
    # record the index of the first and the second double quote
    for i in range(len(atag)):
        if atag[i] == '"' and start == 0:
            start = i
        elif atag[i] == '"' and end == 0:
            end = i
    nxtlink = atag[start+1:end]

So my question is: is there a more efficient way to do this task?

Thank you.

You tagged the question with BeautifulSoup, so I don't see why you would want a regex. If you want the href from anchors, you can use the CSS selector 'a[href]' to find anchor tags that have an href attribute:

    from bs4 import BeautifulSoup

    h = '''<a href="networking-denial-of-service.aspx">next page →</a>'''
    soup = BeautifulSoup(h, "html.parser")
    print(soup.select_one('a[href]')["href"])

Or use find:

    print(soup.find('a', href=True)["href"])
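Both select_one and find return None when nothing matches, so indexing the result with ["href"] would raise a TypeError on input without a link. A minimal sketch of a guarded lookup, assuming a hypothetical helper named first_href and the built-in "html.parser" backend (neither is from the original answer):

```python
from bs4 import BeautifulSoup


def first_href(html):
    """Return the href of the first <a> that has one, or None (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    anchor = soup.find("a", href=True)  # None when no anchor carries an href
    return anchor["href"] if anchor is not None else None


print(first_href('<a href="networking-denial-of-service.aspx">next page →</a>'))
print(first_href('<a name="top">no link here</a>'))  # second input is made up for illustration
```

The first call prints the link target and the second prints None instead of failing.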

If you have multiple anchors:

    for a in soup.select('a[href]'):
        print(a["href"])

Or:

    for a in soup.find_all("a", href=True):
        print(a["href"])
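If the links are needed as data rather than printed, the same selector can feed a list comprehension. This is only a sketch; the second and third anchors in the sample markup are invented for illustration:

```python
from bs4 import BeautifulSoup

# Sample markup: only the first anchor comes from the question, the rest are made up.
html = '''
<a href="networking-denial-of-service.aspx">next page →</a>
<a name="top">no href</a>
<a href="index.aspx">home</a>
'''

soup = BeautifulSoup(html, "html.parser")

# select() returns every match; anchors without an href are skipped by the selector.
hrefs = [a["href"] for a in soup.select("a[href]")]
print(hrefs)
```

The anchor without an href is filtered out by the selector itself, so no extra check is needed inside the loop.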

You can also specify that you want hrefs that have a leading ":

    soup.select_one('a[href^="]')
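For reference, the ^= operator in a CSS attribute selector matches on the start of the attribute value, and that value does not include the quotes that surround it in the markup. A small sketch of a prefix match, assuming an illustrative prefix "networking" and an extra made-up anchor:

```python
from bs4 import BeautifulSoup

# The second anchor is invented so that the prefix actually filters something.
html = '''
<a href="index.aspx">home</a>
<a href="networking-denial-of-service.aspx">next page →</a>
'''

soup = BeautifulSoup(html, "html.parser")

# ^= keeps only anchors whose href value starts with the given prefix.
match = soup.select_one('a[href^="networking"]')
print(match["href"])
```

Here the first anchor is skipped because its href does not start with the prefix.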
