Trying to move an array of JSON objects to its parent level during aggregation

Simplified sample JSON:

{
	"pg": [
		{
			"id": 1,
			"g": [
				{
					"n": [
						"1083825517"
					],
					"t": {
						"y": "e",
						"v": "271485099"
					}
				},
				{
					"n": [
						"1255611059"
					],
					"t": {
						"y": "e",
						"v": "461540880"
					}
				}
			]
		},
		{
			"id": 2,
			"g": [
				{
					"n": [
						"1295891612"
					],
					"t": {
						"y": "e",
						"v": "081367219"
					}
				},
				{
					"n": [
						"1679779144",
						"1922203793",
						"1952685992"
					],
					"t": {
						"y": "e",
						"v": "141940881"
					}
				}
			]
		}
	]
}

Desired result:

{
	"pg": [
		{
			"id": 1,
			"n": [
				"1083825517"
			],
			"t": {
				"y": "e",
				"v": "271485099"
			}
		},
		{
			"n": [
				"1255611059"
			],
			"t": {
				"y": "e",
				"v": "461540880"
			}
		},
		{
			"id": 2,
			"n": [
				"1295891612"
			],
			"t": {
				"y": "e",
				"v": "081367219"
			}
		},
		{
			"n": [
				"1679779144",
				"1922203793",
				"1952685992"
			],
			"t": {
				"y": "e",
				"v": "141940881"
			}
		}
	]
}

The data was first migrated from SQL Server to a collection, and now I am trying to use aggregation to ‘massage’ it into the right format. The format is not alterable. There are 400+ ids, and in each there are 50-500+ n/t combinations, and the n value ranges from 1 to 3-500 unique values.

Any help is appreciated.

Hi wjohnson, welcome to the community!

Quick question, that simple example json, is that one document from a collection of documents like it (and therefore will have an _id field too) or is it that there’s just one document?

Dj

Ok, assuming that there are multiple documents with their own pg section (and _id), here’s an aggregation which should work:

db.getCollection("refold").aggregate(
    [
        { 
            "$unwind" : { 
                "path" : "$pg"
            }
        }, 
        { 
            "$unwind" : { 
                "path" : "$pg.g"
            }
        }, 
        { 
            "$group" : { 
                "_id" : "$_id", 
                "pg" : { 
                    "$push" : { 
                        "id" : "$pg.id", 
                        "n" : "$pg.g.n", 
                        "t" : "$pg.g.t"
                    }
                }
            }
        }
    ], 
    { 
        "allowDiskUse" : false
    }
);

It basically unwinds the pg array, then the g array within that, then rolls it all back up again, grouping based on the document id _id and pushing the id and n and t values into an object and into your new pg array. This does mean all the elements of the pg array have an id field now - unlike your original desired result, but that will likely make them easier to handle.

Dj

FYI I am on Chicago time.

I ran your code and it worked flawlessly, but I think I screwed up what I wanted as a desired result.

I am going to work with your code and see if I can figure it out, but what I am trying to do (based on the sample), is to have each id have multiple n and t values.

Here is the corrected sample of the desired result:

{
	"pg": [
		{
			"id": 1,
			"n": [
				"1083825517"
			],
			"t": {
				"y": "e",
				"v": "271485099"
			},
			"n": [
				"1255611059"
			],
			"t": {
				"y": "e",
				"v": "461540880"
			},
			"id": 2,
			"n": [
				"1295891612"
			],
			"t": {
				"y": "e",
				"v": "081367219"
			},
			"n": [
				"1679779144",
				"1922203793",
				"1952685992"
			],
			"t": {
				"y": "e",
				"v": "141940881"
			}
		}
	]
}

I really appreciate your help, and I am sorry I screwed up the request.

Bill

I’m afraid your desired result isn’t valid JSON. It has multiple fields of the same name in the same object, an object wrapped in an array but thats the only element of that array.

The real question now is what do you want to do with the resulting dataset? That will better determine how to structure things.

Thanks for your help. We received an updated schema that now more closely fits our data.