葫芦的运维日志_python_Removing duplicate dicts from a list


2019/05/03 17:46

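set() can deduplicate a list whose elements are int, float, str, or tuple, but it cannot deduplicate a list whose elements are list, set, or dict, as shown below: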

>>> a=[(1,2),(1,2)]
>>> set(a)
set([(1, 2)])
>>> a=[[1,2],[1,2]]
>>> set(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

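So which elements are hashable, and which are not?
Hashable: int, float, str, and tuple (provided every item inside the tuple is itself hashable)
Not hashable: list, set, dict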


>>> list.__hash__    # evaluates to None, so the interpreter prints nothing
>>> int.__hash__
<slot wrapper '__hash__' of 'int' objects>
>>> tuple.__hash__
<slot wrapper '__hash__' of 'tuple' objects>
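
Instead of inspecting __hash__ by hand, hashability can also be probed by simply calling hash() and catching the TypeError. The helper name is_hashable below is made up for illustration and is not part of the original post; treat it as a minimal sketch.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# One sample value per type discussed above.
samples = [1, 1.5, "a", (1, 2), [1, 2], {1, 2}, {"a": 1}]
for sample in samples:
    print(type(sample).__name__, is_hashable(sample))
```

The three container types list, set, and dict report False; the others report True.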

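Why is list unhashable while tuple is hashable?
(1) A list is mutable: at any point in its lifetime you can change the elements it holds, so a hash computed from its contents would not stay stable.
(2) Whether an element is hashable determines whether it can be looked up by its hash value, which is how set and dict locate their members and keys.
(3) A list does not use hash values to index its elements, so it places no hashability requirement on what it stores; a set does index by hash value, so every element of a set must be hashable.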
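
The short sketch below illustrates these points under simple assumptions: a tuple is only hashable when everything inside it is hashable, while a list needs nothing more than equality comparison to store and find its elements. The variable names are illustrative only.

```python
# A tuple of hashable items can be a set member, so duplicates collapse.
unique_tuples = {(1, 2), (1, 2)}
print(unique_tuples)

# A tuple that contains a list is not hashable, because the list inside can still change.
try:
    hash((1, [2, 3]))
    nested_hashable = True
except TypeError:
    nested_hashable = False
print(nested_hashable)

# A list stores unhashable objects without complaint and finds them with ==.
items = [[1, 2], {"a": 1}]
found_list = [1, 2] in items
found_dict = {"a": 1} in items
print(found_list, found_dict)
```

That last property is what the deduplication function further down relies on: membership tests on a list work for dicts even though set membership does not.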

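In short, set() cannot be applied directly to a list of dicts, so to remove duplicate dicts from a list we have to write our own function.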

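Background:

1. reduce() applies a function cumulatively to the elements of a sequence.

Given a collection (list, tuple, etc.), reduce first applies the two-argument function passed to it to the 1st and 2nd elements, then applies the same function to that result and the 3rd element, and so on until a single result remains. In Python 2 reduce is a builtin; in Python 3 it has to be imported from functools.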
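
As a small illustration of that accumulation, here is reduce summing a list. The function add is just an example, and the import from functools works on Python 3 as well as on Python 2.6 and later.

```python
from functools import reduce


def add(x, y):
    """Two-argument function that reduce applies step by step."""
    return x + y


# Evaluated as ((1 + 2) + 3) + 4
total = reduce(add, [1, 2, 3, 4])

# An optional third argument is used as the starting value: (((100 + 1) + 2) + 3) + 4
total_from_100 = reduce(add, [1, 2, 3, 4], 100)

print(total, total_from_100)
```

The optional starting value is an alternative to prepending an initial element to the input list.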

2. Conditional (ternary) expression:

result = x if y in x else x + [y]

result is x when y in x is True, and x + [y] when it is False.
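
A quick sketch of both branches, with made-up values. Note that x + [y] builds a new list and leaves x itself unchanged.

```python
x = [1, 2]

# 2 is already in x, so the expression evaluates to x itself.
already_present = x if 2 in x else x + [2]

# 3 is not in x, so the expression evaluates to a new list with 3 appended.
newly_added = x if 3 in x else x + [3]

print(already_present, newly_added, x)
```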

3. The lambda used below can also be written as a regular function:

def run_function(x, y):
    return x if y in x else x + [y]

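from functools import reduce  # builtin in Python 2, must be imported in Python 3

def list_dict_duplicate_removal(data_list):
    # Start from an empty list and keep each dict only if it is not already in the result.
    run_function = lambda x, y: x if y in x else x + [y]
    return reduce(run_function, [[], ] + data_list)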

>>> a=[1,3,4,5,2,3,4]
>>> set(a)
set([1, 2, 3, 4, 5])
>>> a=[{"a":123,"b":342},{"a":213,"b":231},{"a":123,"b":221},{"a":123,"b":342}]
>>> set(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
>>> def list_dict_duplicate_removal(data_list):
...     run_function = lambda x, y: x if y in x else x + [y]
...     return reduce(run_function, [[], ] + data_list)
...
>>> list_dict_duplicate_removal(a)
[{'a': 123, 'b': 342}, {'a': 213, 'b': 231}, {'a': 123, 'b': 221}]
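
The reduce version checks y in x against the growing result list for every dict, so its cost grows roughly quadratically with the number of dicts. For larger inputs, a different technique may be worth considering: convert each dict to a hashable key and track the keys in a set. The sketch below is an alternative, not the method used in this post; the name dedupe_dicts is made up, and it assumes every value in the dicts is itself hashable (no nested lists or dicts).

```python
def dedupe_dicts(data_list):
    """Return the dicts of data_list without duplicates, keeping first occurrences in order.

    Assumption: all dict values are hashable, so frozenset(d.items()) can be built.
    """
    seen = set()
    result = []
    for d in data_list:
        key = frozenset(d.items())  # hashable stand-in for the dict's contents
        if key not in seen:
            seen.add(key)
            result.append(d)
    return result


a = [{"a": 123, "b": 342}, {"a": 213, "b": 231}, {"a": 123, "b": 221}, {"a": 123, "b": 342}]
unique = dedupe_dicts(a)
print(unique)
```

If the dicts can contain unhashable values, the reduce version remains the simpler choice, since it only relies on equality.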
