ubelt.util_list module
Utility functions for manipulating iterables, lists, and sequences.
-
class ubelt.util_list.chunks(items, chunksize=None, nchunks=None, total=None, bordermode='none')
Bases: object
Generates successive n-sized chunks from items.
If the last chunk has fewer than chunksize elements, bordermode determines the fill values.
Parameters: - items (Iterable) – input to iterate over
- chunksize (int) – size of each sublist yielded
- nchunks (int) – number of chunks to create (cannot be specified if chunksize is specified)
- bordermode (str) – determines how to handle the last chunk if the length of the input is not divisible by chunksize. Valid values are: {'none', 'cycle', 'replicate'}
- total (int) – hints about the length of the input
Todo
should this handle the case when sequence is a string?
References
http://stackoverflow.com/questions/434287/iterate-over-a-list-in-chunks
- CommandLine:
- python -m ubelt.util_list chunks
Example
>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5, 6, 7]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='none')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='cycle')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
>>> genresult = ub.chunks(items, chunksize=3, bordermode='replicate')
>>> assert list(genresult) == [[1, 2, 3], [4, 5, 6], [7, 7, 7]]
- Doctest:
>>> import ubelt as ub
>>> assert len(list(ub.chunks(range(2), nchunks=2))) == 2
>>> assert len(list(ub.chunks(range(3), nchunks=2))) == 2
>>> # Note: ub.chunks will not do the 2,1,1 split
>>> assert len(list(ub.chunks(range(4), nchunks=3))) == 2
>>> assert len(list(ub.chunks([], 2, None, 'none'))) == 0
>>> assert len(list(ub.chunks([], 2, None, 'cycle'))) == 0
>>> assert len(list(ub.chunks([], 2, None, 'replicate'))) == 0
- Doctest:
>>> from ubelt.util_list import chunks
>>> def _check_len(self):
...     assert len(self) == len(list(self))
>>> _check_len(chunks(list(range(3)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=2))
>>> _check_len(chunks(list(range(2)), nchunks=3))
- Doctest:
>>> import pytest
>>> from ubelt.util_list import chunks
>>> assert pytest.raises(ValueError, chunks, range(9))
>>> assert pytest.raises(ValueError, chunks, range(9), chunksize=2, nchunks=2)
>>> assert pytest.raises(TypeError, len, chunks((_ for _ in range(2)), 2))
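The three bordermode values can be pictured with a small pure-Python sketch. This is not ubelt's implementation: `chunk_with_border` and `chunksize_from_nchunks` are hypothetical names, and deriving the chunk size by rounding up is an assumption chosen because it agrees with the doctests above (for example, 4 items with nchunks=3 give 2 chunks rather than a 2,1,1 split).

```python
from itertools import cycle, islice


def chunksize_from_nchunks(total, nchunks):
    # Assumption: round up, so every chunk except possibly the last is full.
    return -(-total // nchunks)


def chunk_with_border(items, chunksize, bordermode="none"):
    """Hypothetical helper, not ubelt's code: yield lists of up to chunksize items."""
    if bordermode not in {"none", "cycle", "replicate"}:
        raise ValueError("unknown bordermode: %r" % (bordermode,))
    items = list(items)
    for start in range(0, len(items), chunksize):
        chunk = items[start:start + chunksize]
        missing = chunksize - len(chunk)
        if missing and bordermode == "cycle":
            # Pad by wrapping around to the start of the input.
            chunk += list(islice(cycle(items), missing))
        elif missing and bordermode == "replicate":
            # Pad by repeating the final item.
            chunk += [chunk[-1]] * missing
        yield chunk
```

With 'none' the short final chunk is yielded unchanged, so callers that need fixed-size chunks would pick 'cycle' or 'replicate'.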
-
ubelt.util_list.iterable(obj, strok=False)
Checks if the input implements the iterator interface. An exception is made for strings, which return False unless strok is True.
Parameters: - obj (object) – a scalar or iterable input
- strok (bool) – if True allow strings to be interpreted as iterable
Returns: True if the input is iterable
Return type: bool
Example
>>> from ubelt.util_list import iterable
>>> obj_list = [3, [3], '3', (3,), [3, 4, 5], {}]
>>> result = [iterable(obj) for obj in obj_list]
>>> assert result == [False, True, False, True, True, True]
>>> result = [iterable(obj, strok=True) for obj in obj_list]
>>> assert result == [False, True, True, True, True, True]
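A check with the same observable behavior as the example above can be sketched as follows. `is_iterable` is a hypothetical stand-in, and treating only `str` as the special case is an assumption; the real function may handle other string-like types differently.

```python
def is_iterable(obj, strok=False):
    """Hypothetical stand-in for the check described above."""
    if isinstance(obj, str) and not strok:
        # Strings are iterable, but are reported as scalars unless strok is set.
        return False
    try:
        iter(obj)
    except TypeError:
        return False
    return True
```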
-
ubelt.util_list.take(items, indices)
Selects a subset of a list based on a list of indices. This is similar to np.take, but pure python.
Parameters: - items (Sequence) – an indexable object to select items from
- indices (Iterable) – sequence of indexing objects
Returns: subset of the list
Return type: Iterable or scalar
- SeeAlso:
- ub.dict_subset
Example
>>> import ubelt as ub
>>> items = [0, 1, 2, 3]
>>> indices = [2, 0]
>>> list(ub.take(items, indices))
[2, 0]
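Conceptually, selection by index amounts to one lookup per index. The sketch below uses a hypothetical `take_items` helper and is not the library's code. Because it relies only on `items[index]`, the same idea extends to mappings when the "indices" are keys, which is how the argsort example in this module uses take on a dict.

```python
def take_items(items, indices):
    """Hypothetical stand-in: look up each index (or key) in turn, lazily."""
    return (items[index] for index in indices)


picked = list(take_items([0, 1, 2, 3], [2, 0]))
by_key = list(take_items({'a': 3, 'b': 2}, ['b', 'a']))
```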
-
ubelt.util_list.compress(items, flags)
Selects items where the corresponding value in flags is True. This is similar to np.compress and itertools.compress.
Parameters: - items (Iterable) – a sequence to select items from
- flags (Iterable) – corresponding sequence of bools
Returns: a subset of masked items
Return type: Iterable
Example
>>> import ubelt as ub
>>> items = [1, 2, 3, 4, 5]
>>> flags = [False, True, True, False, True]
>>> list(ub.compress(items, flags))
[2, 3, 5]
-
ubelt.util_list.flatten(nested_list)
Transforms a nested iterable into a flat iterable.
This is simply an alias for itertools.chain.from_iterable
Parameters: nested_list (Iterable[Iterable]) – list of lists
Returns: flattened items
Return type: Iterable
Example
>>> import ubelt as ub
>>> nested_list = [['a', 'b'], ['c', 'd']]
>>> list(ub.flatten(nested_list))
['a', 'b', 'c', 'd']
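Since flatten is described as an alias for itertools.chain.from_iterable, it removes exactly one level of nesting. The snippet below calls the standard-library function directly to show two consequences: deeper lists are left intact, and strings at the top level are split into characters.

```python
from itertools import chain

# Only the outermost level is flattened; the inner ['d', 'e'] survives.
one_level = list(chain.from_iterable([['a', 'b'], ['c', ['d', 'e']]]))

# Strings are themselves iterables, so they are broken into characters.
split_strings = list(chain.from_iterable(['ab', 'c']))
```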
-
ubelt.util_list.unique(items, key=None)
Generates unique items in the order they appear.
Parameters: - items (Iterable) – list of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Yields: object – a unique item from the input sequence
- CommandLine:
- python -m utool.util_list --exec-unique_ordered
Example
>>> import ubelt as ub
>>> items = [4, 6, 6, 0, 6, 1, 0, 2, 2, 1]
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == [4, 6, 0, 1, 2]
Example
>>> import six
>>> import ubelt as ub
>>> items = ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'D', 'E']
>>> unique_items = list(ub.unique(items, key=six.text_type.lower))
>>> assert unique_items == ['A', 'b', 'C', 'D', 'e']
>>> unique_items = list(ub.unique(items))
>>> assert unique_items == ['A', 'a', 'b', 'B', 'C', 'c', 'D', 'e', 'E']
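Order-preserving de-duplication with a key function is commonly written with a set of already-seen keys. The sketch below uses a hypothetical `unique_by` generator and assumes that items (or their keys) are hashable; it is not taken from ubelt's source.

```python
def unique_by(items, key=None):
    """Hypothetical sketch: yield the first item seen for each distinct key."""
    seen = set()
    for item in items:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            # The original item is yielded, not its normalized key.
            yield item
```

The key only decides which items count as duplicates, which is why the case-insensitive example keeps 'A' and 'b' rather than their lowercased forms.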
-
ubelt.util_list.argunique(items, key=None)
Returns indices corresponding to the first instance of each unique item.
Parameters: - items (Sequence) – indexable collection of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Yields: int – indices of the unique items
Example
>>> from ubelt.util_list import argunique
>>> items = [0, 2, 5, 1, 1, 0, 2, 4]
>>> indices = list(argunique(items))
>>> assert indices == [0, 1, 2, 3, 7]
>>> indices = list(argunique(items, key=lambda x: x % 2 == 0))
>>> assert indices == [0, 2]
-
ubelt.util_list.unique_flags(items, key=None)
Returns a list of booleans corresponding to the first instance of each unique item.
Parameters: - items (Sequence) – indexable collection of items
- key (Callable, optional) – custom normalization function. If specified returns items where key(item) is unique.
Returns: flags the items that are unique
Return type: List[bool]
Example
>>> from ubelt.util_list import unique_flags
>>> items = [0, 2, 1, 1, 0, 9, 2]
>>> flags = unique_flags(items)
>>> assert flags == [True, True, True, False, False, True, False]
>>> flags = unique_flags(items, key=lambda x: x % 2 == 0)
>>> assert flags == [True, False, True, False, False, False, False]
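The flag and index views of "first occurrence" carry the same information. The sketch below (hypothetical helper names, hashable keys assumed) builds the flags in one pass and then derives the indices from them.

```python
def first_occurrence_flags(items, key=None):
    """Hypothetical sketch: True where an item's key is seen for the first time."""
    seen = set()
    flags = []
    for item in items:
        marker = item if key is None else key(item)
        flags.append(marker not in seen)
        seen.add(marker)
    return flags


def first_occurrence_indices(items, key=None):
    # The indices are simply the positions of the True flags.
    flags = first_occurrence_flags(items, key)
    return [idx for idx, flag in enumerate(flags) if flag]
```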
-
ubelt.util_list.boolmask(indices, maxval=None)
Constructs a list of booleans where an item is True if its position is in indices otherwise it is False.
Parameters: - indices (list) – list of integer indices
- maxval (int) – length of the returned list. If not specified this is inferred from indices
Note
In the future the arg maxval may change its name to shape
Returns: mask: list of booleans. mask[idx] is True if idx in indices
Return type: list
Example
>>> import ubelt as ub
>>> indices = [0, 1, 4]
>>> mask = ub.boolmask(indices, maxval=6)
>>> assert mask == [True, True, False, False, True, False]
>>> mask = ub.boolmask(indices)
>>> assert mask == [True, True, False, False, True]
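The example suggests that, when maxval is omitted, the mask is just long enough to hold the largest index. A sketch under that assumption follows; `make_boolmask` is a hypothetical name, and returning an empty mask for empty input is a choice made here, not documented behavior.

```python
def make_boolmask(indices, maxval=None):
    """Hypothetical sketch of building a boolean mask from integer indices."""
    indices = list(indices)
    if maxval is None:
        # Assumption: infer the length as the largest index plus one.
        maxval = max(indices, default=-1) + 1
    mask = [False] * maxval
    for idx in indices:
        mask[idx] = True
    return mask
```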
-
ubelt.util_list.iter_window(iterable, size=2, step=1, wrap=False)
Iterates through iterable with a window size. This is essentially a 1D sliding window.
Parameters: - iterable (Iterable) – an iterable sequence
- size (int) – sliding window size (default = 2)
- step (int) – sliding step size (default = 1)
- wrap (bool) – wraparound (default = False)
Returns: returns windows in a sequence
Return type: iter
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 1, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 1), (6, 1, 2)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, True
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5), (5, 6, 1)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = [1, 2, 3, 4, 5, 6]
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = [(1, 2, 3), (3, 4, 5)]
Example
>>> from ubelt.util_list import iter_window
>>> iterable = []
>>> size, step, wrap = 3, 2, False
>>> window_iter = iter_window(iterable, size, step, wrap)
>>> window_list = list(window_iter)
>>> print('window_list = %r' % (window_list,))
window_list = []
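The interaction of size, step, and wrap in the four examples above can be reproduced with index arithmetic. The sketch below is an illustration only: `sliding_windows` is a hypothetical name, and it copies the input into a list first, which a lazy implementation would avoid.

```python
def sliding_windows(items, size=2, step=1, wrap=False):
    """Hypothetical sketch of a 1D sliding window over a finite input."""
    items = list(items)
    n = len(items)
    if wrap:
        # Every step-th position starts a window; indices wrap modulo n.
        starts = range(0, n, step)
    else:
        # Only positions that leave room for a full window are used.
        starts = range(0, n - size + 1, step)
    for start in starts:
        yield tuple(items[(start + offset) % n] for offset in range(size))
```

Without wrap, a trailing partial window is dropped rather than padded, which is why the step=2 example stops at (3, 4, 5).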
-
ubelt.util_list.allsame(iterable, eq=operator.eq)
Determines whether all items in a sequence are the same.
Parameters: - iterable (Iterable) – items to determine if they are all the same
- eq (Callable, optional) – function to determine equality (default: operator.eq)
Example
>>> from ubelt.util_list import allsame
>>> allsame([1, 1, 1, 1])
True
>>> allsame([])
True
>>> allsame([0, 1])
False
>>> iterable = iter([0, 1, 1, 1])
>>> next(iterable)
0
>>> allsame(iterable)
True
>>> allsame(range(10))
False
>>> allsame(range(10), lambda a, b: True)
True
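One way to obtain the behavior shown above, including True for an empty input, is to compare every remaining item with the first one. `all_same` below is a hypothetical sketch, not the library's code.

```python
import operator


def all_same(iterable, eq=operator.eq):
    """Hypothetical sketch: compare every item against the first one."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        # An empty input has no two items that differ.
        return True
    return all(eq(first, item) for item in iterator)
```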
-
ubelt.util_list.argsort(indexable, key=None, reverse=False)
Returns the indices that would sort an indexable object.
This is similar to numpy.argsort, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
- reverse (bool, optional) – if True returns in descending order
Returns: indices: list of indices that sort the indexable
Return type: list
Example
>>> import ubelt as ub
>>> # argsort works on dicts by returning keys
>>> dict_ = {'a': 3, 'b': 2, 'c': 100}
>>> indices = ub.argsort(dict_)
>>> assert list(ub.take(dict_, indices)) == sorted(dict_.values())
>>> # argsort works on lists by returning indices
>>> indexable = [100, 2, 432, 10]
>>> indices = ub.argsort(indexable)
>>> assert list(ub.take(indexable, indices)) == sorted(indexable)
>>> # Can use iterators, but be careful. It exhausts them.
>>> indexable = reversed(range(100))
>>> indices = ub.argsort(indexable)
>>> assert indices[0] == 99
>>> # Can use key just like sorted
>>> indexable = [[0, 1, 2], [3, 4], [5]]
>>> indices = ub.argsort(indexable, key=len)
>>> assert indices == [2, 1, 0]
>>> # Can use reverse just like sorted
>>> indexable = [0, 2, 1]
>>> indices = ub.argsort(indexable, reverse=True)
>>> assert indices == [1, 2, 0]
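Handling lists and dictionaries uniformly comes down to sorting (index, value) or (key, value) pairs by value and keeping the first element of each pair. The sketch below uses a hypothetical `arg_sort` function and is an illustration of that idea, not ubelt's implementation.

```python
from collections.abc import Mapping


def arg_sort(indexable, key=None, reverse=False):
    """Hypothetical sketch: return the indices (or keys) in sorted-value order."""
    if isinstance(indexable, Mapping):
        pairs = list(indexable.items())
    else:
        # Any other iterable is enumerated, which consumes iterators.
        pairs = list(enumerate(indexable))
    if key is None:
        ordered = sorted(pairs, key=lambda pair: pair[1], reverse=reverse)
    else:
        ordered = sorted(pairs, key=lambda pair: key(pair[1]), reverse=reverse)
    return [index for index, _ in ordered]
```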
-
ubelt.util_list.argmax(indexable, key=None)
Returns index / key of the item with the largest value.
This is similar to numpy.argmax, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
- CommandLine:
- python -m ubelt.util_list argmax
Example
>>> from ubelt.util_list import argmax
>>> assert argmax({'a': 3, 'b': 2, 'c': 100}) == 'c'
>>> assert argmax(['a', 'c', 'b', 'z', 'f']) == 3
>>> assert argmax([[0, 1], [2, 3, 4], [5]], key=len) == 1
>>> assert argmax({'a': 3, 'b': 2, 3: 100, 4: 4}) == 3
>>> assert argmax(iter(['a', 'c', 'b', 'z', 'f'])) == 3
-
ubelt.util_list.argmin(indexable, key=None)
Returns index / key of the item with the smallest value.
This is similar to numpy.argmin, but it is written in pure python and works on both lists and dictionaries.
Parameters: - indexable (Iterable or Mapping) – indexable to sort by
- key (Callable, optional) – customizes the ordering of the indexable
Example
>>> from ubelt.util_list import argmin
>>> assert argmin({'a': 3, 'b': 2, 'c': 100}) == 'b'
>>> assert argmin(['a', 'c', 'b', 'z', 'f']) == 0
>>> assert argmin([[0, 1], [2, 3, 4], [5]], key=len) == 2
>>> assert argmin({'a': 3, 'b': 2, 3: 100, 4: 4}) == 'b'
>>> assert argmin(iter(['a', 'c', 'A', 'z', 'f'])) == 2
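Both argmax and argmin can be pictured as applying the built-in max or min to (index, value) pairs while comparing only the values. The helpers below (`arg_max`, `arg_min`, `_pairs`) are hypothetical names in a sketch, and which index wins on ties follows the built-ins rather than any documented ubelt guarantee.

```python
from collections.abc import Mapping


def _pairs(indexable):
    # Mappings contribute (key, value); other iterables contribute (position, value).
    if isinstance(indexable, Mapping):
        return indexable.items()
    return enumerate(indexable)


def arg_max(indexable, key=None):
    """Hypothetical sketch: index or key of the largest value."""
    value_of = (lambda value: value) if key is None else key
    return max(_pairs(indexable), key=lambda pair: value_of(pair[1]))[0]


def arg_min(indexable, key=None):
    """Hypothetical sketch: index or key of the smallest value."""
    value_of = (lambda value: value) if key is None else key
    return min(_pairs(indexable), key=lambda pair: value_of(pair[1]))[0]
```

Only the values are compared, so mixed key types such as 'a' and 3 in the same dictionary cause no ordering error.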
-
ubelt.util_list.peek(iterable)
Look at the first item of an iterable. If the input is an iterator, then the next element is consumed (i.e. a pop operation).
Parameters: iterable (List[T]) – an iterable
Returns: item: the first item of an ordered sequence, a popped item from an iterator, or an arbitrary item from an unordered collection.
Return type: T
Example
>>> import ubelt as ub
>>> data = [0, 1, 2]
>>> ub.peek(data)
0
>>> iterator = iter(data)
>>> print(ub.peek(iterator))
0
>>> print(ub.peek(iterator))
1
>>> print(ub.peek(iterator))
2
>>> ub.peek(range(3))
0
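The difference between peeking at a container and peeking at an iterator follows from how `iter` behaves: a list hands out a fresh iterator each time, while an iterator returns itself. The sketch below uses a hypothetical `peek_first` helper to show this, plus one standard-library way to look ahead without losing the item.

```python
from itertools import chain


def peek_first(iterable):
    """Hypothetical sketch: return the next item of iter(iterable)."""
    return next(iter(iterable))


data = [0, 1, 2]
from_list = [peek_first(data), peek_first(data)]          # the list is not consumed

iterator = iter(data)
from_iterator = [peek_first(iterator), peek_first(iterator)]  # each call pops an item

# To inspect an iterator's head and keep it, chain the item back in front.
stream = iter(data)
head = next(stream)
restored = list(chain([head], stream))
```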