My Experience in Serializer-Land (Django Rest Framework)

I was stuck on an issue with Serializers in Django Rest Framework for almost 7 hours yesterday. The docs are not very helpful sometimes, so I would like to share some notes about the things that I have discovered in Serializer-land for those who might come across the same issues.

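Here is an example, written against Django 1.8.2, of a Book model and a BookSerializer. The serializer hooks used throughout this post (restore_object(), from_native(), transform_<fieldname>()) come from the DRF 2.x API: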

from django.db import models
from rest_framework import serializers


class Book(models.Model):
    subject = models.CharField(max_length=8)
    title = models.CharField(max_length=1024)


class BookSerializer(serializers.ModelSerializer):
    subject = serializers.CharField(max_length=1024)

    def validate_subject(self, attrs, source):
        # Store the short code instead of the human-readable subject.
        value = attrs.get(source)
        attrs[source] = {
            'Art, Architecture & Photography': 'AAP'
        }.get(value) or value
        # Validators must hand back the whole attrs dictionary.
        return attrs

    def transform_subject(self, obj, value):
        # Reverse of validate_subject: expand the stored code for output.
        return {
            'AAP': 'Art, Architecture & Photography'
        }.get(value) or value

    class Meta:
        model = Book
        fields = ('id', 'title', 'subject', )

I will walk through a few use cases that I think deserve a more helpful explanation.

A step by step walkthrough of deserialization

The following code is the way to deserialize an object, validate its contents, and save it:

data = {'title': 'Othello (Arden Shakespeare.Third Series)', 'subject': 'Art, Architecture & Photography'}  
book_serializer = BookSerializer(data=data)  
if book_serializer.is_valid():  
    book_serializer.save()

The code path for a DRF serializer is: is_valid() => errors() => from_native() => restore_fields() => perform_validation() => restore_object().

is_valid() validates the data dictionary by reading errors(), which in turn calls from_native(). If validation fails, errors() returns the relevant errors, and is_valid() returns False.

from_native() calls restore_fields(), which checks that the data matches the field declarations of the serializer and converts the dictionary of data into a dictionary of deserialized fields. That dictionary is then handed over to the perform_validation() method.

If validation passes, restore_object() is called to turn the deserialized data into an object, which errors() then stores in self.object:

def errors(self):
    ...
    ret = self.from_native(data, files)

    if not self._errors:
        self.object = ret
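To make the order of calls concrete, here is a minimal pure-Python sketch of the flow described above. MiniSerializer, its fields tuple, and its calls list are hypothetical stand-ins written for illustration, not DRF classes, and the real methods do considerably more work. It only assumes the call order listed in this post.

```python
# Toy model of the deserialization flow; MiniSerializer is NOT a DRF class.
class MiniSerializer:
    fields = ('title', 'subject')  # hypothetical field declarations

    def __init__(self, data):
        self.init_data = data
        self._errors = None
        self.object = None
        self.calls = []  # records the order in which the steps run

    def is_valid(self):
        self.calls.append('is_valid')
        return not self.errors

    @property
    def errors(self):
        # Runs the pipeline once, then caches the result in self._errors.
        if self._errors is None:
            self.calls.append('errors')
            ret = self.from_native(self.init_data)
            if not self._errors:
                self.object = ret
        return self._errors

    def from_native(self, data):
        self.calls.append('from_native')
        self._errors = {}
        attrs = self.restore_fields(data)
        if not self._errors:
            attrs = self.perform_validation(attrs)
        if not self._errors:
            return self.restore_object(attrs)

    def restore_fields(self, data):
        # Check the input against the declared fields.
        self.calls.append('restore_fields')
        attrs = {}
        for name in self.fields:
            if name not in data:
                self._errors[name] = ['This field is required.']
            else:
                attrs[name] = data[name]
        return attrs

    def perform_validation(self, attrs):
        self.calls.append('perform_validation')
        return attrs

    def restore_object(self, attrs):
        self.calls.append('restore_object')
        return dict(attrs)


good = MiniSerializer({'title': 'Othello', 'subject': 'AAP'})
good_ok = good.is_valid()
print(good_ok, good.calls)

bad = MiniSerializer({'title': 'Othello'})  # 'subject' is missing
bad_ok = bad.is_valid()
print(bad_ok, bad.errors, bad.object)
```

In the failing case the sketch stops after restore_fields(), so restore_object() never runs and object stays None.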

A few important points about validators. In our example, we didn't declare the title field on BookSerializer, which means this field will not be checked. You can still implement a validator for that field, as we do for the subject field, but remember that attrs in the validator function will not contain the value of the title field. You'll have to read the input from init_data instead.

The perform_validation() method runs every validate_<fieldname>() method, followed by validate().

Keep one thing in mind when implementing validators: don't forget to return attrs. A validator should always either return attrs or raise a ValidationError, since self.object is built from the value these validators return.
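The following pure-Python sketch shows why the return value matters. run_validators() is a hypothetical stand-in for perform_validation(), assuming only that each validator's return value replaces attrs for the next step. The two validator functions are illustrative and take no self argument.

```python
# Hypothetical mapping from human-readable subjects to short codes.
SUBJECT_CODES = {'Art, Architecture & Photography': 'AAP'}


def validate_subject(attrs, source):
    # Correct: update attrs and hand the whole dictionary back.
    attrs[source] = SUBJECT_CODES.get(attrs[source], attrs[source])
    return attrs


def forgetful_validate_subject(attrs, source):
    # Bug: no return statement, so the caller receives None.
    attrs[source] = SUBJECT_CODES.get(attrs[source], attrs[source])


def run_validators(attrs, validators):
    # Each validator's return value becomes the attrs for the next one.
    for source, validator in validators:
        attrs = validator(attrs, source)
    return attrs


data = {'title': 'Othello', 'subject': 'Art, Architecture & Photography'}
ok = run_validators(dict(data), [('subject', validate_subject)])
lost = run_validators(dict(data), [('subject', forgetful_validate_subject)])
print(ok)    # {'title': 'Othello', 'subject': 'AAP'}
print(lost)  # None
```

With the forgetful validator, everything downstream receives None instead of the validated dictionary.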

Now that we have a Book object, we are ready to store it in the database.

A note about saving

book_serializer.save() saves the Book object that was created during is_valid() to the database. The main difference between ModelSerializer and Serializer is the restore_object() method: ModelSerializer always creates a model instance if there is none, while the plain Serializer version just returns the deserialized dictionary.

If you are using Serializer or BaseSerializer, you have to override the restore_object() method to make the book_serializer.save() call work properly, like so:

class BookSerializer(serializers.Serializer):
    subject = serializers.CharField()

    def validate_subject(self, attrs, source):
        # Store the short code, then hand back the whole attrs dictionary.
        value = attrs.get(source)
        attrs[source] = {
            'Art, Architecture & Photography': 'AAP'
        }.get(value) or value
        return attrs

    def transform_subject(self, obj, value):
        # Expand the stored code back into the human-readable subject.
        return {
            'AAP': 'Art, Architecture & Photography'
        }.get(value) or value

    def restore_object(self, attrs, instance=None):
        # Build a new Book when no existing instance was passed in.
        if instance is None:
            return Book(**attrs)
        return instance

Serializing objects

You can access .data to serialize self.object:

books = Book.objects.all()
# many=True tells the serializer that it is handling a queryset, not one object.
books_serializer = BookSerializer(books, many=True)
books_serializer.data

Since self.object is initialized by the instance argument of the serializer, books can be serialized correctly without running any other methods.

transform_*() methods are essentially the reverse of validators. In the BookSerializer example at the beginning of this post, we save 'AAP' to the database to save space, and transform it back to a human-readable string when serializing it, like so:

def transform_subject(self, obj, value):
    # Look up the stored code; fall back to the raw value if it is unknown.
    return {
        'AAP': 'Art, Architecture & Photography'
    }.get(value) or value
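The round trip can be checked in isolation with plain dictionaries. This is only a sketch: SUBJECT_TO_CODE, CODE_TO_SUBJECT, to_code() and to_subject() are hypothetical names, and deriving the reverse mapping from the forward one is a design choice that keeps the two directions from drifting apart.

```python
# Forward mapping used when validating input (human-readable -> short code).
SUBJECT_TO_CODE = {'Art, Architecture & Photography': 'AAP'}
# Reverse mapping used when serializing (short code -> human-readable).
CODE_TO_SUBJECT = {code: name for name, code in SUBJECT_TO_CODE.items()}


def to_code(subject):
    # What the validator does: unknown subjects pass through unchanged.
    return SUBJECT_TO_CODE.get(subject, subject)


def to_subject(code):
    # What the transform does: unknown codes pass through unchanged.
    return CODE_TO_SUBJECT.get(code, code)


stored = to_code('Art, Architecture & Photography')
shown = to_subject(stored)
print(stored)  # AAP
print(shown)   # Art, Architecture & Photography
```

Values without a mapping, such as to_code('History'), come back unchanged in both directions.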

Saving nested objects

Let's make our example a bit more complicated. The subject field is not a plain char field anymore, but a nested object instead.

class BookSerializer(serializers.ModelSerializer):
    subject = SubjectSerializer()

    class Meta:
        model = Book
        fields = ('id', 'title', 'subject', )

This implementation fails with the error "AttributeError: 'dict' object has no attribute 'save'". If you search for this error on Google, you will find plenty of reports of the same message, but nothing related to this particular problem.

After diving into the source code, I found out that the problem is in the restore_object() method of ModelSerializer:

def restore_object(self, attrs, instance=None):  
    ...
    # Nested forward relations - These need to be marked so we can save
    # them before saving the parent model instance.
    for field_name in attrs.keys():
        if isinstance(self.fields.get(field_name, None), Serializer):
            nested_forward_relations[field_name] = attrs[field_name]
    ...

    instance._nested_forward_relations = nested_forward_relations

This _nested_forward_relations field will be used in save_object():

def save_object(self, obj, **kwargs):  
    ...
    if getattr(obj, '_nested_forward_relations', None):
        # Nested relationships need to be saved before we can save the
        # parent instance.
        for field_name, sub_object in obj._nested_forward_relations.items():
            if sub_object:
                self.save_object(sub_object)
            setattr(obj, field_name, sub_object)
    ...

What's interesting here is that ModelSerializer assumes that any field with its own serializer is a nested forward relation.

In our example, the subject field has its own serializer, but it is not backed by a model that exists in the database. That is why it has trouble saving.
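The failure can be reproduced without Django. In the sketch below, FakeModel is a hypothetical stand-in for a model instance, and save_object() is a simplified imitation of the excerpt above, assuming that saving an object ends with a call to its save() method.

```python
class FakeModel:
    """Hypothetical stand-in for a Django model instance."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)
        self.saved = False

    def save(self):
        self.saved = True


def save_object(obj):
    # Save nested forward relations first, then the parent itself.
    relations = getattr(obj, '_nested_forward_relations', {})
    for field_name, sub_object in relations.items():
        if sub_object:
            save_object(sub_object)
        setattr(obj, field_name, sub_object)
    obj.save()


# Case 1: the nested value is a plain dict, as in the failing example.
book = FakeModel(title='Othello')
book._nested_forward_relations = {
    'subject': {'name': 'Art, Architecture & Photography'}
}
try:
    save_object(book)
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Case 2: the nested value is itself a model-like object, so saving works.
subject = FakeModel(name='Art, Architecture & Photography')
book2 = FakeModel(title='Othello')
book2._nested_forward_relations = {'subject': subject}
save_object(book2)
print(subject.saved, book2.saved)
```

The first case raises the same AttributeError because the nested value is a dict with no save() method, and the parent is never saved.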

My current workaround for this situation is to override restore_object(), which is kind of stupid because the restore_object() method of ModelSerializer already handles the different relations in a neat way. I am still looking for a better solution for this case.

A mysterious case

One more thing that I haven't figured out is this example:

class BookSerializer(serializers.Serializer):
    subject = serializers.CharField(max_length=1024)

    def restore_object(self, attrs, instance=None):
        if instance is None:
            return Book(**attrs)
        return instance

    class Meta:
        fields = ('id', 'title', 'subject', )

The code above raises KeyError: 'id'. I'm still figuring this one out, but any insights would be appreciated. * ^ - ^ *
