Django query on child records without getting duplicate rows

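Question:

I want to write a Django query that finds a set of parent records that have certain kinds of child records. The problem is that a parent record with two children matching the search is included in the results twice.

How can I get each parent only once, even when it has multiple matching children?

I have included a simple example below that demonstrates the problem. Blog is the parent and Entry is the child. When I search for blogs that have an entry with "Hello" in the headline, I get two copies of Jimmy's blog.

Here are the records I create and the query I want: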

b = Blog(name="Jimmy's Jottings")
b.save()
Entry(blog=b, headline='Hello, World!').save()
Entry(blog=b, headline='Hello Kitty').save()

blog_count = Blog.objects.filter(entries__headline__contains='Hello').count()
assert blog_count == 1, blog_count

As you can see, there is only one blog, but the assertion fails with a count of two.

Here is the complete example:

# Tested with Django 1.9.2 
import sys 

import django 
from django.apps import apps 
from django.apps.config import AppConfig 
from django.conf import settings 
from django.db import connections, models, DEFAULT_DB_ALIAS 
from django.db.models.base import ModelBase 

NAME = 'udjango' 


def main(): 
    setup() 

    class Blog(models.Model):
        name = models.CharField(max_length=100)
        tagline = models.TextField()

        def __str__(self):  # __unicode__ on Python 2
            return self.name

    class Entry(models.Model):
        blog = models.ForeignKey(Blog, related_name='entries')
        headline = models.CharField(max_length=255)
        body_text = models.TextField()

        def __str__(self):  # __unicode__ on Python 2
            return self.headline

    syncdb(Blog) 
    syncdb(Entry) 

    b = Blog(name="Jimmy's Jottings") 
    b.save() 
    Entry(blog=b, headline='Hello, World!').save() 
    Entry(blog=b, headline='Hello Kitty').save() 

    blog_count = Blog.objects.filter(entries__headline__contains='Hello').count() 
    assert blog_count == 1, blog_count 

    print('Done.') 


def setup():
    DB_FILE = NAME + '.db'
    with open(DB_FILE, 'w'):
        pass  # wipe the database
    settings.configure(
        DEBUG=True,
        DATABASES={
            DEFAULT_DB_ALIAS: {
                'ENGINE': 'django.db.backends.sqlite3',
                'NAME': DB_FILE}},
        LOGGING={'version': 1,
                 'disable_existing_loggers': False,
                 'formatters': {
                     'debug': {
                         'format': '%(asctime)s[%(levelname)s]'
                                   '%(name)s.%(funcName)s(): %(message)s',
                         'datefmt': '%Y-%m-%d %H:%M:%S'}},
                 'handlers': {
                     'console': {
                         'level': 'DEBUG',
                         'class': 'logging.StreamHandler',
                         'formatter': 'debug'}},
                 'root': {
                     'handlers': ['console'],
                     'level': 'WARN'},
                 'loggers': {
                     "django.db": {"level": "WARN"}}})
    app_config = AppConfig(NAME, sys.modules['__main__'])
    apps.populate([app_config])
    django.setup()
    original_new_func = ModelBase.__new__

    @staticmethod
    def patched_new(cls, name, bases, attrs):
        if 'Meta' not in attrs:
            class Meta:
                app_label = NAME
            attrs['Meta'] = Meta
        return original_new_func(cls, name, bases, attrs)
    ModelBase.__new__ = patched_new


def syncdb(model): 
    """ Standard syncdb expects models to be in reliable locations. 

    Based on https://github.com/django/django/blob/1.9.3 
    /django/core/management/commands/migrate.py#L285 
    """ 
    connection = connections[DEFAULT_DB_ALIAS] 
    with connection.schema_editor() as editor: 
        editor.create_model(model)

main() 

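The trick is to use a subquery that finds the blog ids of the matching children, and then look for all blogs whose id is in that subquery. The subquery can then contain duplicates without causing duplicates in the main query.

Here is the fixed query: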

blog_ids = Entry.objects.filter(headline__contains='Hello').values('blog_id') 
blog_count = Blog.objects.filter(id__in=blog_ids).count() 
assert blog_count == 1, blog_count 

Here is the SQL query it generates:

SELECT COUNT(*) AS "__count" 
FROM "udjango_blog" 
WHERE "udjango_blog"."id" IN 
     (
     SELECT U0."blog_id" 
     FROM "udjango_entry" U0 
     WHERE U0."headline" LIKE '%Hello%' ESCAPE '\' 
     ) 
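
To see why duplicates in the subquery are harmless, here is a minimal sketch that runs a simplified version of that SQL with Python's built-in sqlite3 module instead of Django. The table definitions are cut down to the columns the query touches and the ESCAPE clause is dropped, so treat it as an illustration under those assumptions rather than the exact schema Django creates.

```python
import sqlite3

# In-memory stand-ins for the two tables (simplified, assumed schema).
conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE udjango_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE udjango_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES udjango_blog (id),
        headline TEXT);
    INSERT INTO udjango_blog (id, name) VALUES (1, 'Jimmy''s Jottings');
    INSERT INTO udjango_entry (blog_id, headline) VALUES
        (1, 'Hello, World!'),
        (1, 'Hello Kitty');
""")

# On its own, the subquery returns one blog_id per matching entry,
# so the same blog id shows up twice.
subquery_ids = [row[0] for row in conn.execute(
    """SELECT U0.blog_id FROM udjango_entry U0
       WHERE U0.headline LIKE '%Hello%'""")]

# Inside IN (...), those ids are only used as a membership test,
# so each blog row is counted at most once.
blog_count = conn.execute(
    """SELECT COUNT(*) FROM udjango_blog
       WHERE udjango_blog.id IN (
           SELECT U0.blog_id FROM udjango_entry U0
           WHERE U0.headline LIKE '%Hello%')""").fetchone()[0]

print(subquery_ids, blog_count)
```

The subquery yields the blog id twice, but the outer count is still 1.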

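While I believe Don Kirby's response works, I think a better solution is to add .distinct() at the end of the queryset. This simply eliminates any duplicate rows from the query results. The SQL equivalent is using SELECT DISTINCT in the given query.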
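
With the models from the question, that would look like Blog.objects.filter(entries__headline__contains='Hello').distinct().count(), with .distinct() placed before .count(). As a rough sketch of the SQL-level effect, the following uses Python's built-in sqlite3 module rather than Django. The simplified tables and the hand-written join are assumptions that only approximate what the ORM generates.

```python
import sqlite3

# In-memory stand-ins for the two tables (simplified, assumed schema).
conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE udjango_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE udjango_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES udjango_blog (id),
        headline TEXT);
    INSERT INTO udjango_blog (id, name) VALUES (1, 'Jimmy''s Jottings');
    INSERT INTO udjango_entry (blog_id, headline) VALUES
        (1, 'Hello, World!'),
        (1, 'Hello Kitty');
""")

# Shared FROM/WHERE part: joining blogs to entries gives one row
# per matching entry, which is where the duplicates come from.
JOIN = """
    FROM udjango_blog
    INNER JOIN udjango_entry ON udjango_entry.blog_id = udjango_blog.id
    WHERE udjango_entry.headline LIKE '%Hello%'
"""

# Approximately what the original filter() selects.
plain_rows = conn.execute(
    'SELECT udjango_blog.id, udjango_blog.name' + JOIN).fetchall()

# Approximately what the same queryset selects after .distinct().
distinct_rows = conn.execute(
    'SELECT DISTINCT udjango_blog.id, udjango_blog.name' + JOIN).fetchall()

print(len(plain_rows), len(distinct_rows))
```

The plain join returns Jimmy's blog twice, while the DISTINCT version collapses the identical rows into one.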