Python, JSON and undefined
If you need to test-to-fail [1] the communication with a component that uses JSON from a Python 2.x script, you will need to send invalid JSON messages: ones with attributes missing or with illegal values such as null (among other things). You might say: "Just use a dictionary, skip the missing attributes and don't bother me!" That's a valid remark, but if you use custom objects to represent JSON, you'll hit a small problem: using None to indicate that an attribute won't be part of the message is not an ideal solution, because you would lose the ability to send null values.
An explicit marker for a missing attribute makes sense even with a plain dict, because it shows that the test deliberately sends a message with one attribute missing.
Undefined comes to help...
You know what? Let's start with JSON's origin: JavaScript! First I'll put on my Captain Obvious mask and say something about JavaScript and JSON.
JSON, JavaScript Object Notation, is the native text representation of objects in JavaScript. To stringify a JavaScript object and print it to stdout, one might use the following in Node.js:
    value = {"missing": null, "null": null, "0": 0}
    string = JSON.stringify(value)
    console.log(string)

    {"0":0,"missing":null,"null":null}
Ah, missing is not missing! Fortunately, JavaScript has the undefined value, which represents something that doesn't exist.
    value = {"missing": undefined, "null": null, "0": 0}
    string = JSON.stringify(value)
    console.log(string)

    {"0":0,"null":null}
And finally missing is missing!
Python has no undefined, but hey, it's quite easy to create a simple Undefined for encoding to JSON.
    class UndefinedType(object):
        pass

    Undefined = UndefinedType()
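If you want the marker to be a bit more robust, here is a minimal sketch. It assumes two things that the simple version above does not need: a single shared instance and a readable repr. Both are optional extras for illustration, not something the rest of this post depends on.

```python
class UndefinedType(object):
    """Marker for 'this attribute is not part of the message'."""

    _instance = None

    def __new__(cls):
        # Always hand back the same object so `is Undefined` checks stay reliable.
        if cls._instance is None:
            cls._instance = super(UndefinedType, cls).__new__(cls)
        return cls._instance

    def __repr__(self):
        # Assumption: a readable name is nicer in test logs than the default repr.
        return "Undefined"


Undefined = UndefinedType()

# None means "send null", Undefined means "leave the attribute out".
assert Undefined is not None
assert UndefinedType() is Undefined
```

The identity check (`is Undefined`) is the only thing the encoder needs, so the plain two-line version is enough for the examples that follow.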
... and json in Python comes to harm
OK, we have Undefined. What's next? A custom class that changes the encoding behavior of json.JSONEncoder looks like a good way. No, really, it's not. You can override the default method, but you can't tell the encoder to actually skip a value. And if you return None, it will be encoded as null. Try it yourself:
    import json


    class UndefinedType(object):
        pass

    Undefined = UndefinedType()


    class CustomEncoder(json.JSONEncoder):

        def default(self, obj):
            if obj is Undefined:
                return None
            return super(CustomEncoder, self).default(obj)


    print json.dumps({"missing": Undefined, "None": None, "0": 0}, cls=CustomEncoder)

    {"0": 0, "None": null, "missing": null}
You might think that you can change how a dict (or, better, any collections.Mapping) is encoded through the default method. No, you can't, because dict is special: the encoder serializes dict instances itself and never hands them to default. I really love special things with special rules. And I love their designers. I love them so much that I plan to send them an anthrax postcard. [2]
Now you have a choice: look for another library or get a little dirty [3]. Unfortunately, in a more corporate environment like mine, you need to stick with the json module that comes with Python. Or, yes, fill in some form, send it to you-don't-know-who in you-don't-know-where and wait a few days. Or a few weeks.
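To see the dict special case for yourself, here is a small sketch. RecordingEncoder, Plain and MyDict are hypothetical names made up for this demonstration; the encoder just records which objects reach default.

```python
import json


class RecordingEncoder(json.JSONEncoder):
    """Hypothetical encoder that records every object passed to default()."""

    def __init__(self, *args, **kwargs):
        super(RecordingEncoder, self).__init__(*args, **kwargs)
        self.seen = []

    def default(self, obj):
        self.seen.append(type(obj).__name__)
        # Encode unknown objects as their type name so encoding can finish.
        return type(obj).__name__


class Plain(object):
    pass


class MyDict(dict):
    pass


encoder = RecordingEncoder(sort_keys=True)
encoded = encoder.encode({"d": MyDict(a=1), "p": Plain()})

# Only the Plain instance reached default(); the dict subclass did not.
assert encoder.seen == ["Plain"]
assert encoded == '{"d": {"a": 1}, "p": "Plain"}'
```

Even a subclass of dict is serialized directly, so default never gets a chance to drop or rewrite its items.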
Let's get dirty. Just a little.
In my suggested solution below I've overridden the encode method to handle instances of dict. You could also override the methods of json.JSONEncoder that begin with an underscore [4], but you know, their first character is an underscore, so they are not supposed to be overridden. This solution looks OK enough for dictionaries without nested dictionaries. For nested dictionaries you can always recursively walk the whole object tree and recreate it, but it's a pain and it's not my use case. You've been warned.
The default method is overridden too, so unrecognized types (i.e. anything other than list, dict, int, float and probably a few more) are inspected. Each attribute of such an object ends up in the dictionary that represents it, unless the attribute is callable, its name starts with an underscore or its value is Undefined. Properties are ignored, and again, this one is not my use case.
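Why are properties ignored? Because the instance's `__dict__` holds only what was assigned on the instance itself; methods and properties live on the class. A quick sketch, with a hypothetical Sample class invented for illustration:

```python
class Sample(object):
    def __init__(self):
        self.visible = 1
        self._hidden = 2
        self.callback = len  # a callable stored directly on the instance

    def method(self):
        pass

    @property
    def computed(self):
        return 42


sample = Sample()

# Neither `method` nor `computed` appears here: they belong to the class.
instance_keys = sorted(vars(sample))
assert instance_keys == ["_hidden", "callback", "visible"]

# The same filter the encoder uses: no callables, no leading underscore.
kept = sorted(
    key for key, value in vars(sample).items()
    if not callable(value) and not key.startswith("_")
)
assert kept == ["visible"]
```

So the callable check only matters for callables stored on the instance, like `callback` here.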
Ok. I'll show you the money now:
    import json


    class UndefinedType(object):
        pass

    Undefined = UndefinedType()


    class CustomEncoder(json.JSONEncoder):

        def default(self, obj):
            # return None for Undefined, it's probably an item in an array
            # and that's actually the behavior of JavaScript
            if obj is Undefined:
                return None

            attributes = {}
            obj_dict = obj.__dict__
            for key, value in obj_dict.iteritems():
                # skip callables, names starting with an underscore and Undefined values
                if callable(value) or key.startswith('_') or value is Undefined:
                    continue
                attributes[key] = value
            return attributes

        def encode(self, obj):
            changed_obj = obj
            if isinstance(obj, dict):
                # drop top-level keys whose value is Undefined
                changed_obj = {}
                for key, value in obj.iteritems():
                    if value is not Undefined:
                        changed_obj[key] = value
            return super(CustomEncoder, self).encode(changed_obj)
As described above, it's really not bulletproof and it's very likely that there are better solutions. Here is a very simple showcase:
    print json.dumps({"missing": Undefined, "None": None, "0": 0}, cls=CustomEncoder)
    # {"0": 0, "None": null}

    class Test(object):
        def __init__(self, a, b):
            self._no = 1
            self.missing = Undefined
            self.a = a
            self.b = b
            self.yes = True

        def method(self):
            pass

    test = Test(None, Undefined)
    print json.dumps(test, cls=CustomEncoder)
    # {"a": null, "yes": true}
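The nested case mentioned earlier can be handled by rebuilding the tree before handing it to json.dumps. This is a minimal sketch, not part of the solution above: strip_undefined is a hypothetical helper, it treats Undefined inside a list as null (like the default method does), and it leaves custom objects alone.

```python
import json


class UndefinedType(object):
    pass


Undefined = UndefinedType()


def strip_undefined(obj):
    """Return a copy of obj with Undefined dict values removed at every depth."""
    if isinstance(obj, dict):
        return dict(
            (key, strip_undefined(value))
            for key, value in obj.items()
            if value is not Undefined
        )
    if isinstance(obj, (list, tuple)):
        # Undefined inside an array becomes null instead of disappearing.
        return [None if item is Undefined else strip_undefined(item) for item in obj]
    return obj


message = {
    "outer": {"missing": Undefined, "kept": None},
    "items": [1, Undefined, {"missing": Undefined}],
    "missing": Undefined,
}
encoded = json.dumps(strip_undefined(message), sort_keys=True)
assert encoded == '{"items": [1, null, {}], "outer": {"kept": null}}'
```

It copies the whole structure on every call, which is fine for test messages but is exactly the pain I wanted to avoid.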
I'll put it in a gist in the near future, after some polishing. Well, if it has any value to anyone at all!
[1] I'll write something about test-to-fail and test-to-pass in the future; for now you can find more about it in, for example, Software Testing by Ron Patton.
[2] Yeah, an anthrax postcard makes no sense, but it sounds hilarious to me. And I wanted to tease the CIA crawlers. By the way, I understand that this decision was driven by performance considerations, but still, I hate it.
[3] I've briefly checked a few libraries available for JSON parsing (especially encoding) and I think there is no choice: every one of them is trying to be "performance wise".
[4] You can call them private or protected, but they are something else.