python - Sort & Unique vs Set -

- February 15, 2010

In Python 2.7, in order to retrieve the set of unique strings from a redundant list of strings, which approach is preferred (~10 million strings of length ~20):

a) Sort the list, then delete the repeated strings

    sort(l)
    unique(l)  # some linear-time function

b) Put them in a set

    set(l)

Note that I do not care about the order of the strings.
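The two options above can be sketched as runnable code. This is one possible reading of the pseudocode, not necessarily what the author had in mind: the helper names `sort_unique` and `set_unique` are hypothetical, and `itertools.groupby` stands in for the unspecified linear-time `unique` step.

```python
from itertools import groupby


def sort_unique(strings):
    """Option (a): sort a copy, then keep one item per run of equal neighbours.

    Hypothetical helper; sorting is O(n log n), the groupby pass is O(n).
    """
    return [key for key, _ in groupby(sorted(strings))]


def set_unique(strings):
    """Option (b): hash every string into a set (order is not preserved)."""
    return set(strings)


l = ["pear", "apple", "pear", "fig", "apple"]
print(sort_unique(l))                        # ['apple', 'fig', 'pear']
print(set(sort_unique(l)) == set_unique(l))  # True
```

Both helpers return the same distinct strings. The only difference in the result is that option (a) returns them as a sorted list, which does not matter here since order is irrelevant.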

I made a simple test to check the running time of both solutions. The first test creates a set, and the second test sorts the list (it doesn't remove duplicates, for the sake of simplicity).

As expected, creating the set is faster than sorting, since building a set is O(n) on average while sorting is O(n log n).

    import random
    import string
    import time


    def random_str():
        # Random alphanumeric string of 10 to 20 characters
        size = random.randint(10, 20)
        chars = string.ascii_letters + string.digits
        return ''.join(random.choice(chars) for _ in range(size))


    l = [random_str() for _ in xrange(1000000)]

    # Test 1: build a set from the list, 10 times
    t1 = time.clock()
    for _ in range(10):
        set(l)
    t2 = time.clock()
    print(round(t2 - t1, 3))

    # Test 2: sort the list (without removing duplicates), 10 times
    t1 = time.clock()
    for _ in range(10):
        sorted(l)
    t2 = time.clock()
    print(round(t2 - t1, 3))

The output I got:

    2.77
    11.83
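The test above skips the deduplication step for the sort-based approach. For a like-for-like comparison, that pass could be included as well. The following is a minimal sketch using `timeit`, which also runs on Python 3, where `time.clock` has been removed in newer releases. The function names, the list size, and the duplicate ratio are assumptions chosen to keep the run short, and timings will vary by machine.

```python
import random
import string
import timeit
from itertools import groupby

CHARS = string.ascii_letters + string.digits


def random_str():
    # Random alphanumeric string of 10 to 20 characters, as in the original test
    size = random.randint(10, 20)
    return ''.join(random.choice(CHARS) for _ in range(size))


def unique_via_set(strings):
    # Approach (b): hash every string into a set
    return set(strings)


def unique_via_sort(strings):
    # Approach (a): sort, then collapse runs of equal neighbours
    return [key for key, _ in groupby(sorted(strings))]


# Assumed workload: 40,000 strings, each value appearing (at least) twice
base = [random_str() for _ in range(20000)]
data = base + base
random.shuffle(data)

# Time 5 repetitions of each approach on the same input
set_time = timeit.timeit(lambda: unique_via_set(data), number=5)
sort_time = timeit.timeit(lambda: unique_via_sort(data), number=5)

print("set:         %.3f s" % set_time)
print("sort+unique: %.3f s" % sort_time)
```

Since the deduplication pass only adds linear work on top of the sort, it would not be expected to change which approach comes out ahead.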
