5/24/12

cyclic json

I think it's useful sometimes to serialize cyclic data structures as JSON. Douglas Crockford wrote a nifty way of doing it here. The basic idea is to "decycle" a data structure before serializing it as JSON, replacing redundant references to an object with string-paths pointing to a single version of that object within the data structure. When deserializing such an object, the original links are restored with a "recycle" function.

I had two concerns with Crockford's implementation: First, I was afraid it would be too slow. Crockford's comments point out that each time the algorithm processes an object, it does a linear search to see if it has processed the object already, i.e., O(n2) running time. My solution (premature optimization?) is to add a tag to objects in the original data structure, and just check for this tag when processing new objects. These tags are removed before the method returns.

Second, Crockford marked cyclic references with an object that looks like this: { "$ref" : "$[\"path\"][2][\"object\"]" }. This breaks if an object happens to contain "$ref" as a key. My solution is to find a unique key, which means I need to say somewhere what that key is. (update 1/30/12) I do this by adding a wrapper object with a property called "cycle_root". The wrapper object has another property where the key is the cycle_root, and the value is the decycled data structure. Press "run me" above to see what this looks like. You can also play with the code to see how different data structures are handled.

(update 1/4/12) Third, the original algorithm processes the objects depth first. Now it processes the objects breadth first. To see why this might be useful, consider an array where each element contains a pointer to the next element in the array:

depth-first (old)
{
    "cycle_root": "root_0",
    "root_0": [
        {
            "next": {
                "next": {
                    "next": {
                        "next": "root_0[0]"
                    }
                }
            }
        },
        "root_0[0][\"next\"]",
        "root_0[0][\"next\"][\"next\"]",
        "root_0[0][\"next\"][\"next\"][\"next\"]"
    ]
}
breadth-first (new)
{
    "cycle_root": "root_0",
    "root_0": [
        {
            "next": "root_0[1]"
        },
        {
            "next": "root_0[2]"
        },
        {
            "next": "root_0[3]"
        },
        {
            "next": "root_0[0]"
        }
    ]
}

No comments:

Post a Comment