MongoDB: aggregating results into a nested array
Problem description:
I'm fairly new to MongoDB and am currently stuck on the following situation. Below are 2 sample records from my database:
{
"_id": 1,
"Record": 1,
"Link": [ "https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html" ],
"Location": [ "USA", "PAN", "USA", "USA", "PAN" ],
"Organization": [ "GN", "SOUTHCOM", "UCMJ", "PRC" ],
"Date": [ "2016" ],
"People": [ "P.Walter" ]
}
{
"_id": 2,
"Record": 2,
"Link": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html" ],
"Location": [ "NIC", "GTM", "JAM", "GTM", "PAN" ],
"Organization": [ "CENTAM", "Calibre Mining Corporation", "STRATFOR", "Alder Resources" ],
"Date": [ "2013" ],
"People": [ "Daniel Ortega", "Hugo Chavez", "Paulo Gregoire" ]
}
Basically, I'm trying to get output like this:
{
"Country": "US",
"Years": [
{
"Year": "2016",
"Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
"https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
"https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
},
{
"Year": "2013",
"Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
"https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
"https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
}
],
"Link_Count": 6
}
{
"Country": "UK",
"Years": [
{
"Year": "2009",
"Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
"https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html",
"https://wikileaks.org/gifiles/docs/90/9058_wax-12312008-csv-.html" ]
},
{
"Year": "2011",
"Links": [ "https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html",
"https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html"]
}
],
"Link_Count": 5
}
I tried using aggregation, but I couldn't get the result shown in the output above. Here is my query:
db.test.aggregate([
{
"$unwind": "$Location"
},
{
"$group" : {
"_id": {
"Country": "$Location",
"Year": "$Date",
"Links": "$Link"
},
Loc: {
$addToSet: "$Location"
}
}
},
{
"$unwind": "$Loc"
},
{
"$group": {
"_id": "$Loc",
"Years": { "$push": {
"Year": "$_id.Year",
"Links": "$_id.Links"
}
}
}
}
]).toArray()
I used $unwind and $addToSet on $Location because $Location contains duplicates. I'm open to any suggestions or solutions, so please let me know. Thanks in advance!
Answer:
You can use:
db.test.aggregate([{
"$unwind": "$Location"
}, {
"$unwind": "$Date"
}, {
"$unwind": "$Link"
}, {
"$group": {
"_id": {
"Country": "$Location",
"Year": "$Date"
},
Links: {
$addToSet: "$Link"
}
}
}, {
"$group": {
"_id": "$_id.Country",
Years: {
$push: {
"Year": "$_id.Year",
"Links": "$Links"
}
},
Link_Count: { $sum: { $size: "$Links" } }
}
}])
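The idea is to $unwind all the arrays so that the links can be collected into a new array per country and year, then $push each year's links into the Years array and count the grouped links with $size in the last $group stage.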
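To see what each stage contributes, the same steps can be sketched in plain JavaScript with no MongoDB connection. This is only an in-memory emulation run on the two sample records from the question. The helper names `unwind` and `summarize` are made up for illustration, and the emulation assumes every document has all three array fields (the real $unwind drops documents where the field is missing or empty by default).

```javascript
// The two sample records, reduced to the fields the pipeline uses.
const docs = [
  {
    Link: ["https://wikileaks.org/plusd/cables/1979PANAMA06344_e.html"],
    Location: ["USA", "PAN", "USA", "USA", "PAN"],
    Date: ["2016"],
  },
  {
    Link: ["https://wikileaks.org/gifiles/docs/11/111533_-latam-centam-brief-110822-.html"],
    Location: ["NIC", "GTM", "JAM", "GTM", "PAN"],
    Date: ["2013"],
  },
];

// Emulates { $unwind: "$<field>" }: one output row per array element.
// Assumption: the field is always a present array.
function unwind(rows, field) {
  return rows.flatMap((row) =>
    row[field].map((value) => ({ ...row, [field]: value }))
  );
}

function summarize(input) {
  const rows = unwind(unwind(unwind(input, "Location"), "Date"), "Link");

  // First $group: unique links per (Country, Year).
  // A Set plays the role of $addToSet, so repeated locations collapse.
  const byCountry = new Map(); // country -> Map(year -> Set(link))
  for (const { Location: country, Date: year, Link: link } of rows) {
    if (!byCountry.has(country)) byCountry.set(country, new Map());
    const years = byCountry.get(country);
    if (!years.has(year)) years.set(year, new Set());
    years.get(year).add(link);
  }

  // Second $group: one document per country; the per-year sizes are summed.
  return [...byCountry].map(([country, years]) => {
    const Years = [...years].map(([Year, links]) => ({ Year, Links: [...links] }));
    const Link_Count = Years.reduce((sum, y) => sum + y.Links.length, 0);
    return { _id: country, Years, Link_Count };
  });
}

const result = summarize(docs);
console.log(JSON.stringify(result, null, 2));
```

With the sample data, "PAN" appears in both records, so it ends up with two entries in Years (one per year), while the repeated "USA" values in the first record collapse into a single link.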
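Hi, thanks for the idea! I managed to get the output I wanted with the query above. However, I'd like to ask whether I can use $addToSet on Links, since I only want unique links and the query above gives me duplicates. I really appreciate your help, and sorry for the trouble! – Levi0000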
See my updated answer: it gives unique values in the 'Years.Links' arrays and makes 'Link_Count' the count of unique 'Links' values for that country. –
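Summing the per-year $size values counts a link once for every year it appears under. If a document could carry more than one Date, the same link would then be counted more than once for a country. One possible way to count each link once per country is sketched below. It is an assumption on my part and not necessarily the answerer's code: it assumes MongoDB 3.4 or later (for $reduce), and the intermediate field name `LinkSets` is made up for illustration.

```javascript
// Pipeline definition only; in the mongo shell it would be run as
// db.test.aggregate(pipeline)
const pipeline = [
  { $unwind: "$Location" },
  { $unwind: "$Date" },
  { $unwind: "$Link" },
  // Unique links per (Country, Year).
  {
    $group: {
      _id: { Country: "$Location", Year: "$Date" },
      Links: { $addToSet: "$Link" },
    },
  },
  // One document per country; keep every year's link set for counting later.
  {
    $group: {
      _id: "$_id.Country",
      Years: { $push: { Year: "$_id.Year", Links: "$Links" } },
      LinkSets: { $push: "$Links" }, // hypothetical helper field
    },
  },
  // Union all per-year sets so each link is counted once per country.
  {
    $project: {
      _id: 0,
      Country: "$_id",
      Years: 1,
      Link_Count: {
        $size: {
          $reduce: {
            input: "$LinkSets",
            initialValue: [],
            in: { $setUnion: ["$$value", "$$this"] },
          },
        },
      },
    },
  },
];

console.log(pipeline.length + " stages");
```

The final $project also renames `_id` to `Country`, which matches the field name in the desired output from the question.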