Life is short,
so start using Python early
- raw_list = [["百度", "CPY"], ["京东", "CPY"], ["黄轩", "PN"], ["百度", "CPY"]]
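This list contains nested lists, and one of them, ["百度", "CPY"], appears twice. The goal is to remove the duplicate (two inner lists count as duplicates when both of their elements are equal), keep the original element order, and still output a nested list. The final result should look like this: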
- [["百度", "CPY"], ["京东", "CPY"], ["黄轩", "PN"]]
Deduplication in Python is normally done with set, so I tried the same idea here:
- In [8]: new_list = [list(t) for t in set(tuple(_) for _ in raw_list)]
- In [9]: new_list
- Out[9]: [['京东', 'CPY'], ['百度', 'CPY'], ['黄轩', 'PN']]
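=。= I thought I was done, but then noticed that the order of the nested lists had changed.
Fine, let's go step by step and find where the order changed: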
- In [10]: s = set(tuple(_) for _ in raw_list)
- In [11]: s
- Out[11]: {('京东', 'CPY'), ('百度', 'CPY'), ('黄轩', 'PN')}
Then it hit me. The two key words for set are: unordered and no duplicates =。=
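A minimal sketch of those two properties, reusing the raw_list from above. The names `lists_are_hashable` and `unique_pairs` are only illustrative. It also shows why the inner lists had to be converted to tuples first: lists are unhashable and cannot be set members.

```python
raw_list = [["百度", "CPY"], ["京东", "CPY"], ["黄轩", "PN"], ["百度", "CPY"]]

# Lists are mutable and therefore unhashable, so set(raw_list) raises TypeError.
try:
    set(raw_list)
    lists_are_hashable = True
except TypeError:
    lists_are_hashable = False

# Tuples of strings are hashable, so the duplicated pair collapses to one entry.
unique_pairs = set(tuple(pair) for pair in raw_list)

print(lists_are_hashable)  # False
print(len(unique_pairs))   # 3
# The iteration order of unique_pairs is not guaranteed to match raw_list.
```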
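So there is basically no hope of fixing the order through set itself. I hadn't given up yet, though: the problem now becomes how to sort new_list so that it follows the element order of raw_list, and that surely has to go through sort.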
Looking through the Python documentation for list.sort turns up the following passage:
- sort(*, key=None, reverse=False)
- This method sorts the list in place, using only < comparisons between
- items. Exceptions are not suppressed - if any comparison operations
- fail, the entire sort operation will fail (and the list will likely be left in a
- partially modified state).
- [`sort()`](https://docs.python.org/3/library/stdtypes.html?highlight=sort#list.sort "list.sort")
- accepts two arguments that can only be passed by keyword ( [keyword-only arguments](https://docs.python.org/3/glossary.html#keyword-only-parameter) ):
- key specifies a function of one argument that is used to extract a
- comparison key from each list element (for example, key=str.lower).
- The key corresponding to each item in the list is calculated once and then used for the entire sorting process. The default value of None
- means that list items are sorted directly without calculating a separate
- key value.
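The sort method takes a function through its key parameter; in other words, the value of key is a function.
That function is called on each element of new_list, and sort orders the elements by the values it returns.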
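As a quick illustration of how key works, here is a small sketch using the `key=str.lower` example from the quoted documentation. The word list is made up for this example.

```python
words = ["banana", "apple", "Cherry"]

# Without a key, strings compare by code point, so uppercase "C" sorts first.
plain = list(words)
plain.sort()

# With key=str.lower, each word is compared by its lowercased form instead.
case_insensitive = list(words)
case_insensitive.sort(key=str.lower)

print(plain)             # ['Cherry', 'apple', 'banana']
print(case_insensitive)  # ['apple', 'banana', 'Cherry']
```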
That suggests looking up the index of each new_list element in raw_list and sorting by that index.
- In [13]: new_list.sort(key=raw_list.index)
- In [14]: new_list
- Out[14]: [['百度', 'CPY'], ['京东', 'CPY'], ['黄轩', 'PN']]
The result matches what was expected =。=
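Putting the steps together, the whole approach can be sketched as one small function. The name `dedupe_keep_order` is made up here and is not from the original post. It works because `list.index` returns the position of the first equal element, so every unique item is sorted by where it first appeared.

```python
def dedupe_keep_order(nested):
    """Remove duplicate inner lists, keeping first-occurrence order."""
    # Convert inner lists to tuples so they can be stored in a set,
    # then convert the unique tuples back to lists.
    unique = [list(t) for t in set(tuple(item) for item in nested)]
    # Sort each unique item by the index of its first occurrence in the input.
    unique.sort(key=nested.index)
    return unique


raw_list = [["百度", "CPY"], ["京东", "CPY"], ["黄轩", "PN"], ["百度", "CPY"]]
result = dedupe_keep_order(raw_list)
print(result)  # [['百度', 'CPY'], ['京东', 'CPY'], ['黄轩', 'PN']]
```

Note that `nested.index` scans the list linearly on every call, so this is fine for short lists but can get slow on long ones.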
Source: http://www.jianshu.com/p/3eff6fbd0adc