python - Decompressing a text file


So I have compressed some text, and now I need to decompress it to be able to recreate the text.

The compression code is:

    import zlib, base64

    text = raw_input("enter sentence: ")  # asks the user to input text
    text = text.split()  # splits the sentence into words

    uniquewords = []  # creates an empty array

    for word in text:  # loops over the following
        if word not in uniquewords:  # if the word is not in uniquewords
            uniquewords.append(word)  # it adds the word to the array

    positions = [uniquewords.index(word) for word in text]  # finds the position of each unique word
    positions2 = [x+1 for x in positions]  # adds 1 to each position
    print ("the uniquewords , positions of words are: ")  # prints the unique words and positions
    print uniquewords
    print positions2

    file = open('task3file.txt', 'w')
    file.write('\n'.join(uniquewords))  # adds the unique words to the file
    file.write('\n')
    file.write('\n'.join([str(p) for p in positions2]))
    file.close()

    file = open('compressedtext.txt', 'w')
    text = ', '.join(text)
    compression = base64.b64encode(zlib.compress(text,9))
    file.write('\n'.join(compression))
    print compression
    file.close()

My attempt at decompression is:

    import zlib, base64

    text = ('compressedtext.txt')
    file = open('compressedtext.txt', 'r')
    print ("in file is: \n") + file.read()
    text = ''.join(text)
    data = zlib.decompress(base64.b64decode(text))
    recreated = " ".join([uniquewords[word] for word in positions])  # recreates the sentence
    file.close()  # closes the file
    print ("the sentences recreated: \n") + recreated

But when I run the decompression and try to recreate the original text, an error message appears saying:

    File "C:\Python27\lib\base64.py", line 77, in b64decode
      raise TypeError(msg)
    TypeError: Incorrect padding

Does anyone know how to fix this error?

There are a few things going on here. Let me start by giving a working sample:

    import zlib, base64

    rawtext = raw_input("enter sentence: ")  # asks the user to input text
    text = rawtext.split()  # splits the sentence into words

    uniquewords = []  # creates an empty array
    for word in text:  # loops over the following
        if word not in uniquewords:  # if the word is not in uniquewords
            uniquewords.append(word)  # adds the word to the array

    positions = [uniquewords.index(word) for word in text]  # finds the position of each unique word
    positions2 = [x+1 for x in positions]  # adds 1 to each position
    print ("the uniquewords , positions of words are: ")  # prints the unique words and positions
    print uniquewords
    print positions2

    infile = open('task3file.txt', 'w')
    infile.write('\n'.join(uniquewords))  # adds the unique words to the file
    infile.write('\n')
    infile.write('\n'.join([str(p) for p in positions2]))
    infile.close()

    infile = open('compressedtext.b2', 'w')
    compression = base64.b64encode(zlib.compress(rawtext, 9))
    infile.write(compression)
    print compression
    infile.close()

    # read it back again
    infile = open('compressedtext.b2', 'r')
    text = infile.read()
    print("in file is: " + text)
    recreated = zlib.decompress(base64.b64decode(text))
    infile.close()
    print("the sentences recreated:\n" + recreated)

I've tried to keep things pretty close to what you had, but note in particular a few changes:

  • I'm trying to keep better track of the raw text versus the processed text.

  • I've removed the redefinition of zlib.

  • I've removed the line breaks that break decompression.

  • I've done some general clean-up to better conform to normal Python conventions.
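
As a side note on where the "Incorrect padding" error most likely comes from: in the question's decompression script, text still holds the file name 'compressedtext.txt' when it reaches b64decode, because the result of file.read() is only printed and never stored. The sketch below reproduces that situation. It is written for Python 3 (an assumption, since the code in this thread is Python 2), and the sample sentence is made up for illustration. It also shows what '\n'.join(...) does when it is given a string instead of a list.

```python
import base64
import binascii
import zlib

sample = "the cat sat on the mat"  # hypothetical sample sentence

# Compress, then base64-encode so the result is plain ASCII text.
encoded = base64.b64encode(zlib.compress(sample.encode("utf-8"), 9)).decode("ascii")

# Joining a *string* with '\n' puts a newline between every character.
joined = "\n".join("abc")

# Decoding the file name instead of the file contents fails, because
# 'compressedtext.txt' is not valid base64 data.
try:
    base64.b64decode("compressedtext.txt")
    filename_error = None
except binascii.Error as exc:
    filename_error = exc

# Decoding the actual encoded data works.
restored = zlib.decompress(base64.b64decode(encoded)).decode("utf-8")

print(repr(joined))
print("decoding the file name failed:", filename_error is not None)
print(restored)
```

In Python 3 the failure surfaces as binascii.Error rather than the TypeError shown in the Python 2.7 traceback, but the underlying problem is the same: the wrong string is being decoded.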

Hope this helps.
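
For readers on Python 3, here is a rough sketch of the same compress-and-restore round trip. It is not part of the original answer. The helper names save_compressed and load_compressed are hypothetical, and the sketch assumes UTF-8 text, since zlib.compress needs bytes rather than str in Python 3. The with blocks close the files automatically, so no explicit close() calls are needed.

```python
import base64
import os
import tempfile
import zlib


def save_compressed(text, path):
    """Compress text with zlib, base64-encode it, and write it to path."""
    encoded = base64.b64encode(zlib.compress(text.encode("utf-8"), 9))
    with open(path, "wb") as outfile:  # closed automatically on exit
        outfile.write(encoded)


def load_compressed(path):
    """Read base64 data from path and return the original text."""
    with open(path, "rb") as infile:
        encoded = infile.read()
    return zlib.decompress(base64.b64decode(encoded)).decode("utf-8")


if __name__ == "__main__":
    # Write to a temporary directory so the example leaves no stray files behind.
    demo_path = os.path.join(tempfile.mkdtemp(), "compressedtext.b2")
    save_compressed("the cat sat on the mat", demo_path)
    print(load_compressed(demo_path))
```

Keeping the write and read steps in separate functions makes it harder to mix up the file name with the file contents, which is the kind of slip that is easy to make when everything shares one variable called text.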

