ubelt.util_dict module
Functions for working with dictionaries.
-
ubelt.util_dict.odict
alias of collections.OrderedDict
-
ubelt.util_dict.ddict
alias of collections.defaultdict
-
class ubelt.util_dict.AutoDict
Bases: dict
An infinitely nested default dict of dicts.
Implementation of Perl’s autovivification feature.
- SeeAlso:
- ub.AutoOrderedDict - the ordered version
References
http://stackoverflow.com/questions/651794/init-dict-of-dicts
Example
>>> import ubelt as ub
>>> auto = ub.AutoDict()
>>> auto[0][10][100] = None
>>> assert str(auto) == '{0: {10: {100: None}}}'
-
to_dict()
Recursively casts an AutoDict into a regular dictionary. All nested AutoDict values are also converted.
Returns: a copy of this dict without autovivification
Return type: dict
Example
>>> from ubelt.util_dict import AutoDict
>>> auto = AutoDict()
>>> auto[1] = 1
>>> auto['n1'] = AutoDict()
>>> static = auto.to_dict()
>>> assert not isinstance(static, AutoDict)
>>> assert not isinstance(static['n1'], AutoDict)
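Autovivification can be illustrated with plain Python. The following is a minimal sketch, not ubelt's actual implementation: the class name SketchAutoDict is hypothetical, and it relies only on the standard dict.__missing__ hook.

```python
class SketchAutoDict(dict):
    """Hypothetical stand-in for ub.AutoDict, for illustration only."""

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when `key` is absent. Create a
        # nested instance, store it, and return it so chained indexing works.
        value = self[key] = type(self)()
        return value

    def to_dict(self):
        # Recursively copy into plain dicts, so that looking up a missing
        # key in the result raises KeyError instead of creating an entry.
        return {
            key: value.to_dict() if isinstance(value, SketchAutoDict) else value
            for key, value in self.items()
        }


auto = SketchAutoDict()
auto[0][10][100] = None   # intermediate levels are created on demand
static = auto.to_dict()   # plain nested dicts
```

The to_dict step matters because merely reading a missing key from an autovivifying dict inserts that key, which is easy to do by accident.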
-
class ubelt.util_dict.AutoOrderedDict
Bases: collections.OrderedDict, ubelt.util_dict.AutoDict
An infinitely nested default dict of dicts that maintains the ordering of items.
- SeeAlso:
- ub.AutoDict - the unordered version
Example
>>> import ubelt as ub
>>> auto = ub.AutoOrderedDict()
>>> auto[0][3] = 3
>>> auto[0][2] = 2
>>> auto[0][1] = 1
>>> assert list(auto[0].values()) == [3, 2, 1]
-
ubelt.util_dict.dzip(items1, items2, cls=dict)
Zips elementwise pairs between items1 and items2 into a dictionary. Values from items2 can be broadcast onto items1.
Parameters:
- items1 (Iterable) – full sequence
- items2 (Iterable) – can either be a sequence of one item or a sequence of equal length to items1
- cls (Type[dict]) – dictionary type to use. Defaults to dict, but could be ordered dict instead.
Returns: similar to dict(zip(items1, items2))
Return type: dict
Example
>>> assert dzip([1, 2, 3], [4]) == {1: 4, 2: 4, 3: 4}
>>> assert dzip([1, 2, 3], [4, 4, 4]) == {1: 4, 2: 4, 3: 4}
>>> assert dzip([], [4]) == {}
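The broadcasting rule can be sketched in a few lines. This is an illustrative approximation, not the library's code: the name sketch_dzip is hypothetical, and raising ValueError on a length mismatch is an assumption.

```python
def sketch_dzip(items1, items2, cls=dict):
    """Hypothetical re-creation of the dzip broadcasting behavior."""
    items1 = list(items1)
    items2 = list(items2)
    if len(items2) == 1:
        # Broadcast the single value across every key in items1.
        items2 = items2 * len(items1)
    if len(items1) != len(items2):
        # Assumption: mismatched lengths are treated as an error.
        raise ValueError('items1 and items2 are out of alignment')
    return cls(zip(items1, items2))


broadcast = sketch_dzip([1, 2, 3], [4])
paired = sketch_dzip(['a', 'b'], [1, 2])
```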
-
ubelt.util_dict.group_items(items, groupids)
Groups a list of items by group id.
Parameters:
- items (Iterable) – a list of items to group
- groupids (Iterable or Callable) – a corresponding list of item groupids or a function mapping an item to a groupid.
Returns: groupid_to_items: maps a groupid to a list of items
Return type: dict
- CommandLine:
- python -m ubelt.util_dict group_items
Example
>>> import ubelt as ub
>>> items = ['ham', 'jam', 'spam', 'eggs', 'cheese', 'banana']
>>> groupids = ['protein', 'fruit', 'protein', 'protein', 'dairy', 'fruit']
>>> groupid_to_items = ub.group_items(items, groupids)
>>> print(ub.repr2(groupid_to_items, nl=0))
{'dairy': ['cheese'], 'fruit': ['jam', 'banana'], 'protein': ['ham', 'spam', 'eggs']}
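Since groupids may be either a parallel list or a callable, a short sketch of both modes may help. The function name sketch_group_items is hypothetical and the code only approximates the documented behavior using the standard library.

```python
from collections import defaultdict


def sketch_group_items(items, groupids):
    """Hypothetical illustration of grouping by a list or by a function."""
    if callable(groupids):
        # Compute each item's group id on the fly.
        pairs = ((groupids(item), item) for item in items)
    else:
        # Pair each item with its corresponding precomputed group id.
        pairs = zip(groupids, items)
    groups = defaultdict(list)
    for groupid, item in pairs:
        groups[groupid].append(item)
    return dict(groups)


items = ['ham', 'jam', 'spam', 'eggs', 'cheese', 'banana']
by_label = sketch_group_items(
    items, ['protein', 'fruit', 'protein', 'protein', 'dairy', 'fruit'])
by_length = sketch_group_items(items, len)
```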
-
ubelt.util_dict.dict_hist(item_list, weight_list=None, ordered=False, labels=None)
Builds a histogram of items, counting the number of times each item appears in the input.
Parameters:
- item_list (Iterable) – hashable items (usually containing duplicates)
- weight_list (Iterable) – corresponding weights for each item
- ordered (bool) – if True the result is ordered by frequency
- labels (Iterable, optional) – expected labels (default None). Allows this function to pre-initialize the histogram. If specified, the frequency of each label is initialized to zero and item_list can only contain items specified in labels.
Returns: dictionary where the keys are items in item_list, and the values are the number of times the item appears in item_list.
Return type: dict
- CommandLine:
- python -m ubelt.util_dict dict_hist
Example
>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist = ub.dict_hist(item_list)
>>> print(ub.repr2(hist, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}
Example
>>> import ubelt as ub
>>> item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
>>> hist1 = ub.dict_hist(item_list)
>>> hist2 = ub.dict_hist(item_list, ordered=True)
>>> try:
>>>     hist3 = ub.dict_hist(item_list, labels=[])
>>> except KeyError:
>>>     pass
>>> else:
>>>     raise AssertionError('expected key error')
>>> #result = ub.repr2(hist_)
>>> weight_list = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
>>> hist4 = ub.dict_hist(item_list, weight_list=weight_list)
>>> print(ub.repr2(hist1, nl=0))
{1: 1, 2: 4, 39: 1, 900: 3, 1232: 2}
>>> print(ub.repr2(hist4, nl=0))
{1: 1, 2: 4, 39: 1, 900: 1, 1232: 0}
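The interaction between weight_list and labels can be sketched as follows. This is a hedged approximation with a hypothetical name (sketch_dict_hist). In particular, sorting in ascending order of frequency when ordered=True is an assumption, since the text above only says "ordered by frequency".

```python
from collections import OrderedDict


def sketch_dict_hist(item_list, weight_list=None, ordered=False, labels=None):
    """Hypothetical illustration of a weighted, optionally pre-seeded histogram."""
    item_list = list(item_list)
    if weight_list is None:
        # An unweighted histogram counts each occurrence once.
        weight_list = [1] * len(item_list)
    # Pre-seed every expected label with zero when labels are given.
    hist = {} if labels is None else {label: 0 for label in labels}
    for item, weight in zip(item_list, weight_list):
        if labels is not None and item not in hist:
            # Items outside the expected labels are rejected.
            raise KeyError(item)
        hist[item] = hist.get(item, 0) + weight
    if ordered:
        # Assumption: ascending order of frequency.
        hist = OrderedDict(sorted(hist.items(), key=lambda kv: kv[1]))
    return hist


item_list = [1, 2, 39, 900, 1232, 900, 1232, 2, 2, 2, 900]
counts = sketch_dict_hist(item_list)
weighted = sketch_dict_hist(
    item_list, weight_list=[1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1])
seeded = sketch_dict_hist(['a', 'a'], labels=['a', 'b'])
```

Pre-seeding with labels guarantees that every expected label appears in the result, even with a count of zero.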
-
ubelt.util_dict.find_duplicates(items, k=2, key=None)
Find all duplicate items in a list.
Searches for all items that appear at least k times and returns a mapping from each k-duplicate item to the positions at which it appears.
Parameters:
- items (Iterable) – hashable items possibly containing duplicates
- k (int) – only return items that appear at least k times (default=2)
- key (Callable, optional) – if given, items are grouped by key(items[i]) instead of by the items themselves, and only key values that occur at least k times are returned.
Returns: maps each duplicate item to the indices at which it appears
Return type: dict
- CommandLine:
- python -m ubelt.util_dict find_duplicates
Example
>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> duplicates = ub.find_duplicates(items)
>>> print('items = %r' % (items,))
>>> print('duplicates = %r' % (duplicates,))
>>> assert duplicates == {0: [0, 1, 6], 2: [3, 8], 3: [4, 5]}
>>> assert ub.find_duplicates(items, 3) == {0: [0, 1, 6]}
Example
>>> import ubelt as ub
>>> items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
>>> # note: k can be 0
>>> duplicates = ub.find_duplicates(items, k=0)
>>> print(ub.repr2(duplicates, nl=0))
{0: [0, 1, 6], 1: [2], 2: [3, 8], 3: [4, 5], 9: [9], 12: [7]}
Example
>>> import ubelt as ub
>>> items = [10, 11, 12, 13, 14, 15, 16]
>>> duplicates = ub.find_duplicates(items, key=lambda x: x // 2)
>>> print(ub.repr2(duplicates, nl=0))
{5: [0, 1], 6: [2, 3], 7: [4, 5]}
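The examples above are consistent with a simple "collect positions, then filter by count" approach. The sketch below illustrates that idea under a hypothetical name (sketch_find_duplicates). It is not taken from the library source.

```python
from collections import defaultdict


def sketch_find_duplicates(items, k=2, key=None):
    """Hypothetical illustration: index positions per item, keep those seen >= k times."""
    positions = defaultdict(list)
    for index, item in enumerate(items):
        # Group by the item itself, or by key(item) when a key is supplied.
        label = item if key is None else key(item)
        positions[label].append(index)
    return {label: idxs for label, idxs in positions.items() if len(idxs) >= k}


items = [0, 0, 1, 2, 3, 3, 0, 12, 2, 9]
pairs_or_more = sketch_find_duplicates(items)
triples_or_more = sketch_find_duplicates(items, k=3)
halved = sketch_find_duplicates([10, 11, 12, 13, 14, 15, 16], key=lambda x: x // 2)
```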
-
ubelt.util_dict.dict_subset(dict_, keys, default=NoParam)
Gets a subset of a dictionary.
Parameters:
- dict_ (Mapping) – superset dictionary
- keys (Iterable) – keys to take from dict_
- default (object, optional) – if specified, this value is used for any key missing from dict_
Returns: subset dictionary
Return type: OrderedDict
- SeeAlso:
- dict_isect - similar functionality, but will only take existing keys
Example
>>> import ubelt as ub
>>> dict_ = {'K': 3, 'dcvs_clip_max': 0.2, 'p': 0.1}
>>> keys = ['K', 'dcvs_clip_max']
>>> subdict_ = ub.dict_subset(dict_, keys)
>>> print(ub.repr2(subdict_, nl=0))
{'K': 3, 'dcvs_clip_max': 0.2}
-
ubelt.util_dict.dict_take(dict_, keys, default=NoParam)
Generates values from a dictionary.
Parameters:
- dict_ (Mapping) – a dictionary to take from
- keys (Iterable) – the keys to take
- default (object, optional) – if specified, this value is yielded for any key missing from dict_
- CommandLine:
- python -m ubelt.util_dict dict_take_gen
Example
>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> result = list(ub.dict_take(dict_, keys, None))
>>> assert result == ['a', 'b', 'c', None, None]
Example
>>> import ubelt as ub
>>> dict_ = {1: 'a', 2: 'b', 3: 'c'}
>>> keys = [1, 2, 3, 4, 5]
>>> try:
>>>     print(list(ub.dict_take(dict_, keys)))
>>>     raise AssertionError('did not get key error')
>>> except KeyError:
>>>     print('correctly got key error')
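Because dict_take generates values, a missing key only surfaces once the result is consumed, which is why the example wraps the call in list(). A minimal sketch of that pattern, using a hypothetical name (sketch_dict_take) and a private sentinel standing in for NoParam:

```python
_NO_PARAM = object()  # hypothetical sentinel standing in for NoParam


def sketch_dict_take(dict_, keys, default=_NO_PARAM):
    """Hypothetical illustration of lazily taking values for a list of keys."""
    for key in keys:
        if default is _NO_PARAM:
            # No default given: a missing key raises KeyError here.
            yield dict_[key]
        else:
            yield dict_.get(key, default)


dict_ = {1: 'a', 2: 'b', 3: 'c'}
with_default = list(sketch_dict_take(dict_, [1, 2, 3, 4, 5], None))
lazy = sketch_dict_take(dict_, [1, 4])  # no error yet: nothing has been consumed
first_value = next(lazy)
```

A sentinel object is used rather than None so that None itself remains a valid default value.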
-
ubelt.util_dict.dict_union(*args)
Combines the disjoint keys in multiple dictionaries. For intersecting keys, dictionaries towards the end of the sequence are given precedence.
Parameters: *args – a sequence of dictionaries
Returns: OrderedDict if the first argument is an OrderedDict, otherwise dict
Return type: Dict | OrderedDict
- SeeAlso:
- collections.ChainMap - a Python standard library data structure that provides a view treating multiple dicts as a single dict. https://docs.python.org/3/library/collections.html#chainmap-objects
Example
>>> result = dict_union({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
>>> assert result == {'a': 1, 'b': 2, 'c': 2}
>>> dict_union(odict([('a', 1), ('b', 2)]), odict([('c', 3), ('d', 4)]))
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
>>> dict_union()
{}
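To contrast an eager union with the ChainMap view mentioned under SeeAlso, the sketch below uses only the standard library. The merge loop is an illustration of the "later dictionaries win" rule, not ubelt's code.

```python
from collections import ChainMap

first = {'a': 1, 'b': 1}
second = {'b': 2, 'c': 2}

# Eager union: copy every key into a new dict; later inputs overwrite earlier ones.
merged = {}
for mapping in (first, second):
    merged.update(mapping)

# Lazy view: ChainMap searches its mappings left to right, so the
# highest-precedence dict must be listed first.
view = ChainMap(second, first)

# Mutating an input is visible through the view but not in the merged copy.
first['a'] = 10
```

Note the reversed argument order: the union gives precedence to the last dictionary, while ChainMap gives precedence to the first.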
-
ubelt.util_dict.dict_isect(*args)
Constructs a dictionary that contains keys common between all inputs. The returned values are taken from the first dictionary only.
Parameters: *args – a sequence of dictionaries (or sets of keys)
Returns: OrderedDict if the first argument is an OrderedDict, otherwise dict
Return type: Dict | OrderedDict
Notes
This function can be used as an alternative to dict_subset where any key not in the dictionary is ignored. See the following example:
>>> dict_isect({'a': 1, 'b': 2, 'c': 3}, ['a', 'c', 'd'])
{'a': 1, 'c': 3}
Example
>>> dict_isect({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
{'b': 1}
>>> dict_isect(odict([('a', 1), ('b', 2)]), odict([('c', 3)]))
OrderedDict()
>>> dict_isect()
{}
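The key-intersection rule can be sketched as follows. The name sketch_dict_isect is hypothetical, and the sketch assumes the first argument is a mapping while later arguments may be mappings or plain collections of keys.

```python
from collections import OrderedDict


def sketch_dict_isect(*args):
    """Hypothetical illustration: keep keys of the first dict present in all inputs."""
    if not args:
        return {}
    first = args[0]
    common = set(first)
    for other in args[1:]:
        # Iterating a dict yields its keys, so dicts, lists, and sets all work.
        common &= set(other)
    cls = OrderedDict if isinstance(first, OrderedDict) else dict
    # Iterate over `first` so its key order and its values are preserved.
    return cls((key, first[key]) for key in first if key in common)


subset = sketch_dict_isect({'a': 1, 'b': 2, 'c': 3}, ['a', 'c', 'd'])
shared = sketch_dict_isect({'a': 1, 'b': 1}, {'b': 2, 'c': 2})
```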
-
ubelt.util_dict.map_vals(func, dict_)
Applies a function to each of the values in a dictionary.
Parameters:
- func (callable) – a function or indexable object
- dict_ (dict) – a dictionary
Returns: transformed dictionary
Return type: dict
- CommandLine:
- python -m ubelt.util_dict map_vals
Example
>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> func = len
>>> newdict = ub.map_vals(func, dict_)
>>> assert newdict == {'a': 3, 'b': 0}
>>> print(newdict)
>>> # Can also use indexables as `func`
>>> dict_ = {'a': 0, 'b': 1}
>>> func = [42, 21]
>>> newdict = ub.map_vals(func, dict_)
>>> assert newdict == {'a': 42, 'b': 21}
>>> print(newdict)
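Accepting "a function or indexable object" can be handled by falling back to the object's __getitem__ when it is not callable. The sketch below shows that idea under a hypothetical name (sketch_map_vals). Whether the library does exactly this is an assumption.

```python
def sketch_map_vals(func, dict_):
    """Hypothetical illustration of mapping values with a callable or an indexable."""
    if not callable(func):
        # Treat lists, dicts, and other indexables as lookup tables.
        func = func.__getitem__
    return {key: func(val) for key, val in dict_.items()}


lengths = sketch_map_vals(len, {'a': [1, 2, 3], 'b': []})
looked_up = sketch_map_vals([42, 21], {'a': 0, 'b': 1})
```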
-
ubelt.util_dict.map_keys(func, dict_)
Applies a function to each of the keys in a dictionary.
Parameters:
- func (callable) – a function or indexable object
- dict_ (dict) – a dictionary
Returns: transformed dictionary
Return type: dict
- CommandLine:
- python -m ubelt.util_dict map_keys
Example
>>> import ubelt as ub
>>> dict_ = {'a': [1, 2, 3], 'b': []}
>>> func = ord
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {97: [1, 2, 3], 98: []}
>>> #ut.assert_raises(AssertionError, map_keys, len, dict_)
>>> dict_ = {0: [1, 2, 3], 1: []}
>>> func = ['a', 'b']
>>> newdict = ub.map_keys(func, dict_)
>>> print(newdict)
>>> assert newdict == {'a': [1, 2, 3], 'b': []}
>>> #ut.assert_raises(AssertionError, map_keys, len, dict_)
-
ubelt.util_dict.invert_dict(dict_, unique_vals=True)
Swaps the keys and values in a dictionary.
Parameters:
- dict_ (dict) – dictionary to invert
- unique_vals (bool) – if False, inverted keys are returned in a set. The default is True.
Returns: inverted
Return type: dict
Notes
The values must be hashable.
If the original dictionary contains duplicate values, then only one of the corresponding keys will be returned and the others will be discarded. This can be prevented by setting unique_vals=False, causing the inverted keys to be returned in a set.
- CommandLine:
- python -m ubelt.util_dict invert_dict
Example
>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 2}
>>> inverted = ub.invert_dict(dict_)
>>> assert inverted == {1: 'a', 2: 'b'}
Example
>>> import ubelt as ub
>>> dict_ = ub.odict([(2, 'a'), (1, 'b'), (0, 'c'), (None, 'd')])
>>> inverted = ub.invert_dict(dict_)
>>> assert list(inverted.keys())[0] == 'a'
Example
>>> import ubelt as ub
>>> dict_ = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
>>> inverted = ub.invert_dict(dict_, unique_vals=False)
>>> assert inverted == {0: {'b', 'c', 'd'}, 1: {'a'}, 2: {'f'}}
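The difference between the two modes can be sketched in plain Python. The name sketch_invert_dict is hypothetical. In this sketch the last key seen wins when unique_vals=True and values repeat, which is a property of the sketch and not a documented guarantee of the library.

```python
def sketch_invert_dict(dict_, unique_vals=True):
    """Hypothetical illustration of one-to-one versus one-to-many inversion."""
    if unique_vals:
        # One key per value. With repeated values, later keys overwrite
        # earlier ones, so information is silently lost.
        return {val: key for key, val in dict_.items()}
    inverted = {}
    for key, val in dict_.items():
        # Collect every key that mapped to this value.
        inverted.setdefault(val, set()).add(key)
    return inverted


dict_ = {'a': 1, 'b': 0, 'c': 0, 'd': 0, 'f': 2}
lossy = sketch_invert_dict(dict_)
lossless = sketch_invert_dict(dict_, unique_vals=False)
```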