regex - Python script to carve up CSV data
I have CSV data that looks like this:
    724 "overall evaluation: 2 invite interview: 2 strength or novelty of idea (1): 3 strength or novelty of idea (2): 3 strength or novelty of idea (3): 2 use or provision of open data (1): 3 use or provision of open data (2): 3 ""open default"" (1): 2 ""open default"" (2): 2 value proposition , potential scale (1): 3 value proposition , potential scale (2): 4 market opportunity , timing (1): 3 market opportunity , timing (2): 4 triple bottom line impact (1): 4 triple bottom line impact (2): 3 triple bottom line impact (3): 2 knowledge , skills of team (1): 4 knowledge , skills of team (2): 4 capacity realise idea (1): 4 capacity realise idea (2): 4 capacity realise idea (3): 3 appropriateness of budget realise idea: 4"
    724 "overall evaluation: 1 invite interview: 1 strength or novelty of idea (1): 2 strength or novelty of idea (2): 2 strength or novelty of idea (3): 3 use or provision of open data (1): 2 use or provision of open data (2): 2 ""open default"" (1): 3 ""open default"" (2): 3 value proposition , potential scale (1): 2 value proposition , potential scale (2): 2 market opportunity , timing (1): 2 market opportunity , timing (2): 2 triple bottom line impact (1): 2 triple bottom line impact (2): 2 triple bottom line impact (3): 1 knowledge , skills of team (1): 4 knowledge , skills of team (2): 2 capacity realise idea (1): 2 capacity realise idea (2): 2 capacity realise idea (3): 1 appropriateness of budget realise idea: 3"
Using Python and regex, is it possible to identify every instance of the words "overall evaluation:", and log the number before it (724 in this example) and the value that comes after "overall evaluation:" (i.e. 2), such that I'm left with:
    724, 2
    724, 1
for instance?
If so, how would I implement such logic?
I tried this:
    f = open("1.txt", 'r').read().splitlines()
    head = '0'
    body = []
    for x in f:
        if x == "\n" or x.strip() == '':
            continue
        try:
            int(x[0])
            print(head + ':' + '+'.join(body))
            tmp = x.split()
            head = tmp[0] + '-' + tmp[1]
            body = [tmp[4]]
        except ValueError as e:
            body.append(x.split(':')[1].strip().strip('\"'))
    print(head + ':' + '+'.join(body))
but it didn't work :/
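This should work:
    lines = open("1.txt", 'r').read().splitlines()
    for l in lines:
        # Split each line around the marker text
        data = l.split(' "overall evaluation: ')
        # Exactly two parts means the marker occurred once on this line
        if len(data) == 2:
            print(data[0] + ", " + data[1])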
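The split function uses the string ' "overall evaluation: ' as the separator, so data[0] is the number before it and data[1] is whatever follows it on that line.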
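Since the question asks about regex, here is a minimal sketch of that route. It assumes every record starts with the numeric ID, then whitespace, then an opening quote followed by "overall evaluation:" and an integer score. The names PATTERN, extract_overall and sample are made up for illustration, and sample is a shortened version of the data above. Because the pattern runs over the whole text, it should not matter whether each record sits on its own line or not.

```python
import re

# Assumed record layout: <id> "overall evaluation: <score> ...
# \s+ also matches newlines, so records may span several lines.
PATTERN = re.compile(r'(\d+)\s+"overall evaluation:\s*(\d+)')


def extract_overall(text):
    """Return (record_id, score) string pairs for every match in text."""
    return PATTERN.findall(text)


# Shortened, illustrative stand-in for the contents of 1.txt
sample = (
    '724 "overall evaluation: 2 invite interview: 2" '
    '724 "overall evaluation: 1 invite interview: 1"'
)

for record_id, score in extract_overall(sample):
    print(record_id + ", " + score)
```

To run it on the real file, you could pass open("1.txt").read() to extract_overall instead of sample.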