按类别创建项目列表的限制排列

问题描述：

我正在尝试创建一些项目列表的限制排列。每个项目都有一个类别，我需要找到项目的组合，以便每个组合不包含来自同一类别的多个项目。为了说明，这里有一些示例数据：按类别创建项目列表的限制排列

Name  | Category 
    ==========|========== 
1. Orange | fruit 
2. Apple  | fruit 
3. GI-Joe | toy 
4. VCR  | electronics 
5. Racquet | sporting goods

组合将被限制在三个长度，我不需要每个长度的每个组合。因此，上述清单的一组组合可能是：

(Orange, GI-Joe, VCR) 
(Orange, GI-Joe, Racquet) 
(Orange, VCR, Racquet) 
(Apple, GI-Joe, VCR) 
(Apple, GI-Joe, Racquet) 
... and so on.

我经常这样做，在各种名单上。列表永远不会超过40个项目的长度，但可以理解的是，可以创建数千个组合（尽管每个列表可能会有大约10个独特类别，限制它）

我已经想出了一些伪-python，我将如何递归地实现它。自从我使用combinatorics的时间太长了，但从我记得这基本上是组合的一个子集，就像C（列表长度，所需的大小）一样。有可能一些库模块，它可以使这种清洁剂（或至少具有更好的性能）

我想知道是否可能有比我有什么更好的方法（也许有它使用itertools.combinations不知）：

# For the sake of this problem, let's assume the items are hashable so they 
# can be added to a set. 

def combinate(items, size=3): 
    assert size >=2, "You jerk, don't try it." 
    def _combinate(index, candidate): 
     if len(candidate) == size: 
      results.add(candidate) 
      return 
     candidate_cats = set(x.category for x in candidate) 
     for i in range(index, len(items)): 
      item = items[i] 
      if item.category not in candidate_cats: 
       _combinate(i, candidate + (item,)) 

    results = set() 
    for i, item in enumerate(items[:(1-size)]): 
     _combinate(i, (item,)) 

    return results

答

幼稚的做法：

#!/usr/bin/env python 

import itertools 

items = { 
    'fruits' : ('Orange', 'Apple'), 
    'toys' : ('GI-Joe',), 
    'electronics' : ('VCR',), 
    'sporting_goods' : ('Racquet',) 
} 

def combinate(items, size=3): 
    if size > len(items): 
     raise Exception("Lower the `size` or add more products, dude!") 

    for cats in itertools.combinations(items.keys(), size): 
     cat_items = [[products for products in items[cat]] for cat in cats] 
     for x in itertools.product(*cat_items): 
      yield zip(cats, x) 

if __name__ == '__main__': 
    for x in combinate(items): 
     print x

将产生：

# ==> 
# 
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('sporting_goods', 'Racquet')] 
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Orange')] 
# [('electronics', 'VCR'), ('toys', 'GI-Joe'), ('fruits', 'Apple')] 
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')] 
# [('electronics', 'VCR'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')] 
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Orange')] 
# [('toys', 'GI-Joe'), ('sporting_goods', 'Racquet'), ('fruits', 'Apple')]

您编辑的实现看起来或多或少正是我所需要的。谢谢！ – Crast 2010-09-10 17:59:29

不客气。 – miku 2010-09-10 20:39:27

答

您试图生成的是从category集合中取得的元素的笛卡尔product。

划分成多组相对容易：

item_set[category].append(item)

通过适当实例（例如）为collections.defaultdictitem_set[category]然后itertools.product会给你所需的输出。

啊，是的笛卡尔产品。我的大脑不停地想着'克莱恩之星'，然后不断拒绝它，因为那些是无限的。谢谢你，你的评论和MYYN的编辑实施应该让我朝着正确的方向前进 – Crast 2010-09-10 17:56:34

的确，你看起来像你正在教鱼，不给鱼，但这并不意味着未来的问题是不受欢迎的。 – msw 2010-09-10 22:29:51

按类别创建项目列表的限制排列

相关推荐