Proto DataStore (and Protocol Buffers intro)

Storing small bits of data was traditionally a job for SharedPreferences in the Android development world. This changed with the introduction of Jetpack DataStore, a newer library for storing small amounts of data that aims to replace the aging SharedPreferences API.

With SharedPreferences, the type of each value is not recorded anywhere the caller can check. The caller needs to remember what type of data corresponds to each key. For instance, if you initially saved a movie_id key with an Int value and then (after forgetting the value type you saved) tried to fetch it as a String, a runtime error (a ClassCastException) would occur.
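
To make the failure mode concrete, here is a minimal sketch in plain Java. UntypedPrefsSketch is a hypothetical stand-in for an untyped key-value store, not the Android SharedPreferences API; it only assumes that values are stored as plain objects and cast at read time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an untyped key-value store; not the Android API.
final class UntypedPrefsSketch {
    private final Map<String, Object> values = new HashMap<>();

    void putInt(String key, int value) {
        values.put(key, value); // stored as a boxed Integer
    }

    // The caller picks the type at read time; nothing checks it against what was stored.
    String getString(String key, String defaultValue) {
        Object stored = values.get(key);
        // The cast fails at runtime if the stored value is not a String.
        return stored != null ? (String) stored : defaultValue;
    }

    public static void main(String[] args) {
        UntypedPrefsSketch prefs = new UntypedPrefsSketch();
        prefs.putInt("movie_id", 42);
        try {
            prefs.getString("movie_id", "");
        } catch (ClassCastException e) {
            System.out.println("movie_id was not stored as a String");
        }
    }
}
```

The compiler accepts both calls, so the mistake only shows up when the read actually runs.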

The DataStore library comes in two "flavors". Preferences DataStore more closely resembles the existing key-value SharedPreferences approach: it solves other shortcomings of the old API, but not type safety (see "DataStore Preferences and migrating from SharedPreferences").

The second "flavor" is Proto DataStore, which solves this type-safety issue using Protocol Buffers.

Protocol Buffers

Protocol Buffers (or protobufs for short) is a way of serializing data. You declare a "schema" and then the protobuf compiler generates classes (similar to Kotlin data classes) in different languages that you can use to serialize/deserialize your data.

message User { //  [1]
  string name = 1; // [2]
  int32 id = 2;
  bool is_registered = 3; // [3]
}

For instance, the above schema compiled to Java will create a class that can be instantiated like this.

User tom = User.newBuilder()
    .setId(1234)
    .setName("Tom Brown")
    .setIsRegistered(false)
    .build();

The builder and the immutable classes were created by the compiler from that schema with no additional input! Note that these kinds of classes can be generated for any language that the proto compiler supports. Some comments on the schema:

  1. Each class to be generated starts with a message keyword. You can have multiple messages in a single file.
  2. Protobufs have types that correspond to different native types in each language they support. Take a look at this table. The types in the example map to the Java/Kotlin types you would expect.
  3. The numbers in the schema are called field numbers. They identify each field in the serialized binary format. You can freely change the name of a field, but never its field number, since that breaks backward compatibility. Never change these after you start using them in production.
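
As a rough illustration of why the field number matters more than the field name, the sketch below hand-encodes a single varint field the way the protobuf wire format does: a tag built from the field number and wire type, followed by the value. FieldNumberSketch and its methods are hypothetical names made up for this example, and it only handles non-negative values; real code should always go through the generated classes.

```java
import java.io.ByteArrayOutputStream;

// Illustrative encoder for one varint field; not a replacement for generated code.
final class FieldNumberSketch {
    private static final int WIRE_TYPE_VARINT = 0;

    // Encodes one non-negative int32 field as: tag varint, then value varint.
    static byte[] encodeVarintField(int fieldNumber, int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The tag carries the field number and wire type; the field name never appears.
        writeVarint(out, (fieldNumber << 3) | WIRE_TYPE_VARINT);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Emits 7 bits at a time, lowest group first; the high bit means "more bytes follow".
    private static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encodeVarintField(2, 1234))); // id = 2 -> 10 d2 09
        System.out.println(toHex(encodeVarintField(4, 1234))); // renumbered to 4 -> 20 d2 09
    }
}
```

Renaming id changes nothing in these bytes, while renumbering it changes the first byte, so data written under the old number would no longer be read back into that field.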

Protocol Buffers have a full syntax of their own, and it's outside the scope of this post to explain it completely. This was just a quick intro to the syntax and its usage. Take a look at the official doc (tutorial, doc) for more info. They are extremely flexible (e.g. support for repeated fields and, in proto2, required fields).

Set up

To install the library and the protobuf compiler you need to change your build.gradle files a bit more than usual.

buildscript {
  [...]
  
  repositories {
    [...]
    mavenCentral()
  }
  
  dependencies {
    [...]
    classpath 'com.google.protobuf:protobuf-gradle-plugin:0.8.13'
  }
}
build.gradle (project)
apply plugin: 'com.google.protobuf'

dependencies {
    [...]
    implementation  "androidx.datastore:datastore-core:1.0.0-alpha01"
    implementation  "com.google.protobuf:protobuf-javalite:3.10.0"
}

protobuf {
    protoc {
        artifact = "com.google.protobuf:protoc:3.10.0"
    }

    generateProtoTasks {
        all().each { task ->
            task.builtins {
                java {
                    option 'lite'
                }
            }
        }
    }
}
build.gradle (module)

Check out the official instructions for installing the latest versions of Protocol Buffers and Proto DataStore.

Proto file

First, you need to create the proto file, which is the schema for your data. Create a new directory called proto under app/src/main (so the full path is app/src/main/proto). Then create a file called user.proto in it.

syntax = "proto3";

option java_package = "com.example.protodatastore";
option java_multiple_files = true;

message User {
  string name = 1;
  int32 id = 2;
  bool is_registered = 3;
}
user.proto

For the protobuf compiler to create the classes defined in the proto file, you need to build the project using Build -> Make Project. After the build is complete, you will have a com.example.protodatastore.User class that you can use. Another link to the Proto syntax doc.

The next step is to create a serializer/deserializer for your proto. This will take a stream of bytes and create a User instance (deserializer), or take a User instance and write it out as a stream of bytes (serializer). Most of this class is boilerplate since the actual job is done by the class generated by the protobuf compiler.

object UserSerializer : Serializer<User> {
    override fun readFrom(input: InputStream): User {
        try {
            return User.parseFrom(input)
        } catch (exception: InvalidProtocolBufferException) {
            throw CorruptionException("Error deserializing proto", exception)
        }
    }

    override fun writeTo(t: User, output: OutputStream) = t.writeTo(output)
}
User serializer / deserializer

Reading

First, create a DataStore instance, passing it the name of the file in which your data will be stored and the serializer we created earlier.

private val dataStore: DataStore<User> =
    context.createDataStore(
        fileName = "user.pb",
        serializer = UserSerializer)

Then read using this instance, handling the case where something goes wrong (DataStore throws an IOException when it fails to read the data).

val userFlow: Flow<User> = dataStore.data
    .catch { exception ->
        if (exception is IOException) {
            emit(User.getDefaultInstance())
        } else {
            throw exception
        }
    }

To use this value in your ViewModel you can call userFlow.asLiveData() or userFlow.collect(). For a quick intro to Flow, see "Into the Flow: Kotlin cold streams primer".

Writing

Use the suspending updateData() method. Its lambda receives the current state of User as a parameter and returns the updated one. The data is updated transactionally in an atomic read-modify-write operation.

suspend fun updateIsRegistered(isRegistered: Boolean) {
    dataStore.updateData { user ->
        user.toBuilder().setIsRegistered(isRegistered).build()
    }
}
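
To see why an atomic read-modify-write matters, here is a small in-memory analogy in plain Java. UpdateDataSketch is a hypothetical class written for this post; DataStore itself persists to disk and is implemented differently. The sketch only mirrors the shape of updateData(): a transform from the current value to the new one, applied so that concurrent updates are not lost.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in; not how DataStore is implemented.
final class UpdateDataSketch<T> {
    private final AtomicReference<T> current;

    UpdateDataSketch(T initial) {
        current = new AtomicReference<>(initial);
    }

    // Applies transform to the latest value; retried if another thread changed it first.
    T updateData(UnaryOperator<T> transform) {
        return current.updateAndGet(transform);
    }

    T get() {
        return current.get();
    }

    public static void main(String[] args) throws InterruptedException {
        UpdateDataSketch<Integer> store = new UpdateDataSketch<>(0);
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                store.updateData(n -> n + 1);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(store.get()); // 20000: no increment is lost
    }
}
```

With a separate read followed by a write, two threads could both read the same value and one of the updates would be overwritten.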

Migrating

When creating the DataStore instance you can provide a mapping between your existing SharedPreferences and your new Proto DataStore. The migration runs once, and after it has completed you should use only the Proto DataStore.

private val sharedPrefsMigration = SharedPreferencesMigration(
    context,
    OLD_USER_SHARED_PREFERENCES_NAME
) { sharedPrefs: SharedPreferencesView, currentData: User ->
    // Copy each old preference into the matching proto field
    currentData.toBuilder()
        .setName(sharedPrefs.getString(NAME_KEY, ""))
        .setId(sharedPrefs.getInt(ID_KEY, -1))
        .setIsRegistered(sharedPrefs.getBoolean(IS_REGISTERED_KEY, false))
        .build()
}

private val dataStore: DataStore<User> = context.createDataStore(
    fileName = "user.pb",
    serializer = UserSerializer,
    migrations = listOf(sharedPrefsMigration)
)
Migrating from SharedPreferences

Hopefully, this was a useful quick intro to Proto DataStore and the protocol buffer syntax. It needs some boilerplate and has an initial setup cost, but I think the type safety it offers (and not having to maintain keys) is worth it. For more details check the excellent documentation, code lab, or official blog post.
