Union of multiple sets in python

Tags : python, python-3.x, list

Top 5 Answers for Union of multiple sets in python

96 votes

The itertools module makes short work of this problem:

>>> from itertools import chain
>>> list(set(chain.from_iterable(d)))
[1, '41', '42', '43', '40', '34', '30', '44']

Another way to do it is to unpack the list into separate arguments for union():

>>> list(set().union(*d))
[1, '41', '42', '43', '40', '34', '30', '44']

The latter way eliminates all duplicates and doesn't require that the inputs first be converted to sets. Also, it doesn't require an import.
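As a quick check that the two approaches agree, here is a minimal sketch. The contents of d are an assumption, modeled on the sample lists shown elsewhere on this page; the names via_chain and via_union are only illustrative.

```python
from itertools import chain

# Assumed sample input: a list of lists with overlapping values.
d = [[1, '34', '44'], [1, '40', '30', '41'], [1, '41', '40', '42']]

# Approach 1: flatten lazily with chain.from_iterable, then deduplicate.
via_chain = set(chain.from_iterable(d))

# Approach 2: unpack the inner lists as arguments to union() on an empty set.
via_union = set().union(*d)

print(via_chain == via_union)  # True
print(len(via_union))          # 7
```

Set ordering is arbitrary, so the list(...) results can come out in a different order than the output shown above.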

83 votes

Using the unpacking operator *:

>>> list(set().union(*a))
[1, '44', '30', '42', '43', '40', '41', '34']

(Thanks Raymond Hettinger and ShadowRanger for the comments!)

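(Note that

set.union(*tup)

will unpack to

set.union(tup[0], tup[1], ... tup[n - 1])

so in this form tup[0] must itself be a set; set().union(*a), as used above, has no such requirement.)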
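The unpacking behavior can be seen in a small sketch. The tuple tup below is made-up sample data, and unbound_ok is just an illustrative flag; the point is that calling union on an empty set accepts any iterables, while calling set.union directly needs a real set as its first argument.

```python
# Hypothetical sample data: a tuple of plain lists.
tup = ([1, '34'], [1, '40'], ['40', '41'])

# Unpacking with * is equivalent to passing each element as its own argument.
unpacked = set().union(*tup)
explicit = set().union(tup[0], tup[1], tup[2])
print(unpacked == explicit)  # True

# set.union called on the class treats its first argument as "self",
# so a list in that position raises TypeError.
try:
    set.union(*tup)
    unbound_ok = True
except TypeError:
    unbound_ok = False
print(unbound_ok)  # False
```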

78 votes

You can use itertools for this. Assume your list of lists is named A:

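import itertools

# Flatten the list of lists into one list, then deduplicate with set().
single_list_with_all_values = list(itertools.chain(*A))
# The original sorted the list first; that does not affect the resulting set,
# and on Python 3 it raises TypeError for mixed int/str values, so it is omitted.
print(set(single_list_with_all_values))
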
69 votes

In [20]: s
Out[20]:
[[1, '34', '44'],
 [1, '40', '30', '41'],
 [1, '41', '40', '42'],
 [1, '42', '41', '43'],
 [1, '43', '42', '44'],
 [1, '44', '34', '43']]

In [31]: list({x for _list in s for x in _list})
Out[31]: [1, '44', '30', '42', '43', '40', '41', '34']
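For readers less used to nested comprehensions, the set comprehension can be read as two nested loops. This is only a sketch with a shortened copy of s; the names from_comprehension and from_loops are illustrative.

```python
# Shortened sample input (an assumption, in the same shape as s above).
s = [[1, '34', '44'], [1, '40', '30', '41'], [1, '41', '40', '42']]

# The comprehension reads left to right: outer loop first, inner loop second.
from_comprehension = {x for _list in s for x in _list}

# Equivalent explicit loops.
from_loops = set()
for _list in s:
    for x in _list:
        from_loops.add(x)

print(from_comprehension == from_loops)  # True
```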

Update:

Thanks for the comments

50 votes

>>> from functools import reduce  # required on Python 3; reduce is a builtin on Python 2
>>> big = [[1, '34', '44'], [1, '40', '30', '41'], [1, '41', '40', '42'], [1, '42', '41', '43'], [1, '43', '42', '44'], [1, '44', '34', '43']]
>>> set(reduce(lambda l, a: l + a, big))
set([1, '44', '30', '42', '43', '40', '41', '34'])

And if you really want a list containing a single list as the final result:

>>> [list(set(reduce(lambda l, a: l + a, big)))]
[[1, '44', '30', '42', '43', '40', '41', '34']]

And if you would rather not write a lambda for the list addition:

>>> [list(set(reduce(list.__add__, big)))]
[[1, '44', '30', '42', '43', '40', '41', '34']]

EDIT: after your recommendation to use itertools.chain instead of list.__add__, I ran timeit on both with the variable from the original question.

timeit reports around 2.8 seconds for list.__add__ and around 3.5 seconds for itertools.chain.

I checked the linked page and yes, you were right: itertools.chain has a from_iterable method that gives a huge performance boost. See the timings below for list.__add__, itertools.chain and itertools.chain.from_iterable.

>>> timeit.timeit("[list(set(reduce ( list.__add__, big)))]", setup="big = [ [10,20,30,40] for ele in range(10000)]", number=30)
16.051744650801993
>>> timeit.timeit("[list(set(reduce ( itertools.chain, big)))]", setup="big = [ [10,20,30,40] for ele in range(10000)]", number=30)
54.721315866467194
>>> timeit.timeit("list(set(itertools.chain.from_iterable(big)))", setup="big = [ [10,20,30,40] for ele in range(10000)]", number=30)
0.040056066849501804

Thank you very much for your advice :)
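To repeat this kind of comparison on Python 3, a sketch along these lines may help. The function names, the smaller input size and the repeat count are assumptions chosen to keep the run short; absolute timings will differ by machine, so no expected numbers are shown.

```python
import timeit
from functools import reduce
from itertools import chain

# Smaller input than in the answer above, so the quadratic case finishes quickly.
big = [[10, 20, 30, 40] for _ in range(2000)]

def with_add():
    # Repeated list concatenation copies the growing list on every step.
    return set(reduce(list.__add__, big))

def with_from_iterable():
    # Flattens lazily in a single pass.
    return set(chain.from_iterable(big))

def with_union():
    # Passes every inner list as an argument to union().
    return set().union(*big)

for fn in (with_add, with_from_iterable, with_union):
    elapsed = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {elapsed:.4f}s")
```

All three functions return the same set; only the amount of intermediate copying differs.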
