# python - How do you split a list into evenly sized chunks?

### Top 5 Answers for python - How do you split a list into evenly sized chunks?

Here's a generator that yields the chunks you want:

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
```

```python
import pprint
# list(...) around range() so that each chunk is a list on Python 3;
# slicing a bare range object would yield range objects instead.
pprint.pprint(list(chunks(list(range(10, 75)), 10)))
[[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
 [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
 [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
 [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
 [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
 [70, 71, 72, 73, 74]]
```

If you're using Python 2, you should use `xrange()` instead of `range()`:

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in xrange(0, len(lst), n):
        yield lst[i:i + n]
```

You can also use a list comprehension instead of writing a function, though it's a good idea to encapsulate operations like this in named functions so that your code is easier to understand. Python 3:

```python
[lst[i:i + n] for i in range(0, len(lst), n)]
```

Python 2 version:

```python
[lst[i:i + n] for i in xrange(0, len(lst), n)]
```

If you want something super simple:

```python
def chunks(l, n):
    n = max(1, n)
    return (l[i:i+n] for i in range(0, len(l), n))
```

Use `xrange()` instead of `range()` on Python 2.x.
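The `max(1, n)` guard matters because `range()` rejects a step of zero. Here is a minimal sketch of the difference, written for Python 3; `chunks_unguarded` is a hypothetical name for the same function without the guard, included only for comparison:

```python
def chunks(l, n):
    """Return a generator of n-sized chunks; n is clamped to at least 1."""
    n = max(1, n)
    return (l[i:i + n] for i in range(0, len(l), n))


def chunks_unguarded(l, n):
    """Same idea without the clamp (hypothetical, for comparison only)."""
    return (l[i:i + n] for i in range(0, len(l), n))


data = [1, 2, 3, 4, 5]

# With the guard, a chunk size of 0 is silently treated as 1.
guarded = list(chunks(data, 0))

# Without it, range() raises ValueError because the step is zero.
try:
    list(chunks_unguarded(data, 0))
    unguarded_error = None
except ValueError as exc:
    unguarded_error = exc
```

Whether silently clamping is preferable to raising is a design choice; the clamp hides a bad argument, while the exception surfaces it.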

I know this is kind of old, but nobody has mentioned `numpy.array_split` yet:

```python
import numpy as np

lst = range(50)
np.array_split(lst, 5)
# [array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
#  array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),
#  array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29]),
#  array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39]),
#  array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])]
```
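Note that the second argument of `numpy.array_split` is the number of chunks, not the chunk size. When the length doesn't divide evenly, the earlier chunks come out one element longer. Here is a rough pure-Python sketch of that sizing rule, assuming plain lists are wanted instead of arrays; `split_into` is a hypothetical helper name, not part of NumPy:

```python
def split_into(seq, parts):
    """Split seq into `parts` lists whose lengths differ by at most one."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(seq), parts)
    result = []
    start = 0
    for index in range(parts):
        # The first `extra` chunks absorb one leftover element each.
        size = base + (1 if index < extra else 0)
        result.append(list(seq[start:start + size]))
        start += size
    return result


even = split_into(range(50), 5)   # five chunks of 10 items each
uneven = split_into(range(7), 3)  # chunk lengths 3, 2, 2
```

This is the opposite trade-off from the slicing answers: the number of chunks is fixed and their size varies, rather than the size being fixed with a short last chunk.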

Directly from the (old) Python documentation (recipes for itertools):

```python
from itertools import izip, chain, repeat

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)
```

The current version, as suggested by J.F.Sebastian:

```python
#from itertools import izip_longest as zip_longest # for Python 2.x
from itertools import zip_longest # for Python 3.x
#from six.moves import zip_longest # for both (uses the six compat library)

def grouper(n, iterable, padvalue=None):
    "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
    return zip_longest(*[iter(iterable)]*n, fillvalue=padvalue)
```

I guess Guido's time machine works—worked—will work—will have worked—was working again.

These solutions work because `[iter(iterable)]*n` (or the equivalent in the earlier version) creates one iterator, repeated `n` times in the list. `izip_longest` then effectively performs a round-robin of "each" iterator; because this is the same iterator, it is advanced by each such call, resulting in each such zip-roundrobin generating one tuple of `n` items.
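Here is a small sketch of that shared-iterator behavior, written for Python 3 (so `zip_longest` rather than `izip_longest`). The variable names are only illustrative:

```python
from itertools import zip_longest

letters = iter("abcdefg")

# Multiplying the list copies the reference, not the iterator:
# all three slots point at the same object.
slots = [letters] * 3
same_object = all(slot is letters for slot in slots)

# zip_longest pulls one item from each slot in turn, so every tuple
# consumes three consecutive items from the single underlying iterator.
groups = list(zip_longest(*slots, fillvalue="x"))

# With three independent iterators the items are not grouped at all;
# each tuple just repeats the same letter.
independent = list(zip_longest(*[iter("abcdefg") for _ in range(3)]))
```

Here `groups` matches the docstring of `grouper` above, while `independent` starts with `('a', 'a', 'a')`, which shows that the grouping depends entirely on the iterator being shared.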

I'm surprised nobody has thought of using `iter`'s two-argument form:

```python
from itertools import islice

def chunk(it, size):
    it = iter(it)
    return iter(lambda: tuple(islice(it, size)), ())
```

Demo:

```python
>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
```

This works with any iterable and produces output lazily. It returns tuples rather than iterators, but I think it has a certain elegance nonetheless. It also doesn't pad; if you want padding, a simple variation on the above will suffice:

```python
from itertools import islice, chain, repeat

def chunk_pad(it, size, padval=None):
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)
```

Demo:

```python
>>> list(chunk_pad(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk_pad(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]
```

Like the `izip_longest`-based solutions, the above always pads. As far as I know, there's no one- or two-line itertools recipe for a function that optionally pads. By combining the above two approaches, this one comes pretty close:

```python
from itertools import chain, islice, repeat

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    # Identity check: _no_padding is a unique sentinel object.
    if padval is _no_padding:
        it = iter(it)
        sentinel = ()
    else:
        it = chain(iter(it), repeat(padval))
        sentinel = (padval,) * size
    return iter(lambda: tuple(islice(it, size)), sentinel)
```

Demo:

```python
>>> list(chunk(range(14), 3))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13)]
>>> list(chunk(range(14), 3, None))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, None)]
>>> list(chunk(range(14), 3, 'a'))
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 'a')]
```

I believe this is the shortest chunker proposed that offers optional padding.

As Tomasz Gandor observed, the two padding chunkers will stop unexpectedly if they encounter a long sequence of pad values. Here's a final variation that works around that problem in a reasonable way:

```python
from itertools import islice

_no_padding = object()

def chunk(it, size, padval=_no_padding):
    it = iter(it)
    chunker = iter(lambda: tuple(islice(it, size)), ())
    # Identity check: _no_padding is a unique sentinel object.
    if padval is _no_padding:
        yield from chunker
    else:
        for ch in chunker:
            yield ch if len(ch) == size else ch + (padval,) * (size - len(ch))
```

Demo:

```python
>>> list(chunk([1, 2, (), (), 5], 2))
[(1, 2), ((), ()), (5,)]
>>> list(chunk([1, 2, None, None, 5], 2, None))
[(1, 2), (None, None), (5, None)]
```
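To make the early-stop problem concrete, here is a rough side-by-side sketch. `chunk_pad_sentinel` and `chunk_pad_safe` are hypothetical names for the sentinel-based padding chunker and for the final variation restricted to its always-pad case. The input is chosen so that a full chunk of pad values appears mid-stream:

```python
from itertools import chain, islice, repeat


def chunk_pad_sentinel(it, size, padval=None):
    """Sentinel-based padder: stops at the first chunk made only of padval."""
    it = chain(iter(it), repeat(padval))
    return iter(lambda: tuple(islice(it, size)), (padval,) * size)


def chunk_pad_safe(it, size, padval=None):
    """Chunk first, then pad only the final short chunk."""
    it = iter(it)
    for ch in iter(lambda: tuple(islice(it, size)), ()):
        yield ch + (padval,) * (size - len(ch))


data = [1, 2, None, None, 5]

# The chunk (None, None) equals the sentinel, so iteration ends early
# and the trailing 5 is silently dropped.
truncated = list(chunk_pad_sentinel(data, 2))

# Here the sentinel is the empty tuple, which islice only produces once
# the input is exhausted, so every item survives.
complete = list(chunk_pad_safe(data, 2))
```

With this input, `truncated` contains only `(1, 2)`, while `complete` keeps all three chunks and pads the last one.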