Minimizing Shared State with Swift Value Types

iOS Developer @ Chrono24
The Chrono24 iOS app existed back in the Objective-C days and has come a long way since. Back then, we were pretty much forced to inherit from NSObject and therefore use reference types for almost everything. With the introduction of Swift in 2014 and its value types, a whole new world of possibilities opened up.

In this article we talk about how value types in Swift allow us to write safer code by minimizing shared state, making the code much easier to reason about than before.

 

The Problem with Reference Types

Instances of classes have an inherent identity, the object identity: an address on the heap where their data is stored. When passing an object as an argument to some function or assigning it to some variable, we really only copy a pointer to where the object lives on the heap. The object is said to be passed by reference. There is still only one object living on the heap but multiple pointers referencing it.

In many situations, this is desirable: we only store the potentially expensive object data once and can have as many cheap pointers to it as we’d like. If the object is immutable, i.e. its data doesn’t change over time, this is great.

But not everything can be immutable. Maybe we need mutability because we cannot configure everything directly in the initializer. Or maybe our use case warrants mutable classes, e.g. a class modeling a conversation that the user can mark as read. When we reference instances of mutable classes from different parts of our application, we end up with so-called shared mutable state. That is potentially dangerous: one part of the application can mutate the object without the other parts being notified, and those parts simply read the updated data the next time they access the object, whether they want to or not. In addition, every part referencing shared mutable state is able to mutate it, even though there is generally a single owner managing the data while the other parts merely consume it.
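The conversation example can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Conversation` class with an `isRead` flag; the names are not taken from the Chrono24 code base.

```swift
// Hypothetical model class, used only to illustrate shared mutable state.
final class Conversation {
    let id: Int
    var isRead: Bool

    init(id: Int, isRead: Bool = false) {
        self.id = id
        self.isRead = isRead
    }
}

// Two parts of the app end up holding a reference to the same object.
let inboxConversation = Conversation(id: 1)
let detailConversation = inboxConversation   // copies the pointer, not the data

// The detail screen marks the conversation as read ...
detailConversation.isRead = true

// ... and the inbox observes the new state without ever being notified.
print(inboxConversation.isRead)                   // true
print(inboxConversation === detailConversation)   // true: one object, two references
```

Nothing in the type system tells the inbox that its data changed, and nothing stops the detail screen from mutating data it does not own.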

If we don’t actually want shared mutable state, we can make defensive copies of our model objects when passing them to another part of the application, so that changes made in one part aren’t visible in the other. A shallow copy is generally not sufficient because the object may itself contain a mutable object. Instead we may need a deep copy, which copies the primary object’s contents as well as all recursively contained mutable objects. But that copy would probably be mutable too, and other parts of the application could mutate it without us knowing if they somehow get a reference to it. To be really safe, we would need two versions of a data type, a mutable and an immutable one, so that we can hand an immutable copy to another part of the application.
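The difference between a shallow and a deep copy can be shown with a small sketch. The classes `Participant` and `ChatThread` and their copy methods are hypothetical and exist only for this illustration.

```swift
// Hypothetical nested reference types.
final class Participant {
    var name: String
    init(name: String) { self.name = name }
}

final class ChatThread {
    var title: String
    var participant: Participant

    init(title: String, participant: Participant) {
        self.title = title
        self.participant = participant
    }

    // Copies only the top-level object; the nested Participant stays shared.
    func shallowCopy() -> ChatThread {
        ChatThread(title: title, participant: participant)
    }

    // Copies the top-level object and the nested mutable object.
    func deepCopy() -> ChatThread {
        ChatThread(title: title, participant: Participant(name: participant.name))
    }
}

let original = ChatThread(title: "Offer", participant: Participant(name: "Alice"))
let shallow = original.shallowCopy()
let deep = original.deepCopy()

original.participant.name = "Bob"

print(shallow.participant.name)   // Bob: the shallow copy still shares the participant
print(deep.participant.name)      // Alice: the deep copy is isolated
```

Every new nested class means remembering to extend `deepCopy()`, which is exactly the kind of bookkeeping that is easy to get wrong.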

When different parts of the application intentionally share mutable state, we need to make sure that when one part modifies the shared state, the other parts get a chance to react accordingly. We can do this by implementing the observer pattern for our mutable classes. To correctly implement the observer pattern, a mutable object must observe all of its referenced objects to notify its own observers if one of the referenced objects changed. And if one of its references changes to point to a new instance, it must not forget to unsubscribe from the old instance and subscribe to the new instance.
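To give an idea of the bookkeeping involved, here is a deliberately small sketch. It assumes hypothetical `Seller` and `Listing` classes and supports only a single observer closure per object; a real implementation would need a list of observers and more care.

```swift
// Hypothetical nested object that reports its own changes.
final class Seller {
    var onChange: (() -> Void)?
    var name: String { didSet { onChange?() } }
    init(name: String) { self.name = name }
}

// Hypothetical outer object that must forward changes of its nested object.
final class Listing {
    var onChange: (() -> Void)?
    var price: Int { didSet { onChange?() } }
    var seller: Seller {
        didSet {
            oldValue.onChange = nil   // unsubscribe from the old instance
            observeSeller()           // subscribe to the new instance
            onChange?()
        }
    }

    init(price: Int, seller: Seller) {
        self.price = price
        self.seller = seller
        observeSeller()               // property observers do not run during init
    }

    private func observeSeller() {
        seller.onChange = { [weak self] in self?.onChange?() }
    }
}

var notifications = 0
let oldSeller = Seller(name: "A")
let listing = Listing(price: 100, seller: oldSeller)
listing.onChange = { notifications += 1 }

listing.price = 90                   // notification 1
listing.seller.name = "B"            // notification 2, forwarded from the seller
listing.seller = Seller(name: "C")   // notification 3, re-subscribes
oldSeller.name = "D"                 // no notification: we unsubscribed
print(notifications)                 // 3
```

Forgetting the unsubscribe step in `didSet` would let the old seller keep triggering updates for a listing it no longer belongs to.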

Unfortunately, immutable types can quickly turn into mutable types due to new requirements, and then we have a real problem: until now, our code correctly assumed the type to be immutable, so it neither made defensive copies nor registered observers where it would have done so for a mutable type. All of these places need to be updated to account for the class no longer being immutable, and they may not be easy to find. We must also consider objects that appear to be immutable at first glance but contain immutable references to mutable objects: once a deeply nested type becomes mutable, so do all of the types that recursively reference it. And then there is also the possibility of someone slipping us a mutable subclass of an immutable superclass without us knowing…

 

How Value Types Save the Day

Value types don’t have an inherent identity as reference types do. All they have is content, and if that content is equal to the content of another value, the two are considered equivalent and substitutable. Value types are passed by value: when passed along to another part of the program, their content is copied, not just the address of the source value. As a result, instances of value types are not shared: everyone automatically gets their own defensive copy and can mutate it freely without affecting the rest of the application. This allows us to write more predictable code that is easier to reason about. Of course value types live somewhere in memory too and therefore have an address, but we generally don’t see or deal with this address.
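A minimal sketch of this behavior, assuming a hypothetical `Watch` struct:

```swift
// Hypothetical value type.
struct Watch: Equatable {
    var brand: String
    var price: Int
}

let original = Watch(brand: "Omega", price: 5000)
var copy = original      // copies the content, not a pointer
copy.price = 4500        // mutates only the copy

print(original.price)    // 5000
print(copy.price)        // 4500

// Values with equal content are interchangeable; there is no identity to compare.
print(original == Watch(brand: "Omega", price: 5000))   // true
```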

The mutability of a value type is not baked into the type itself. Instead, the declaration site gets to choose whether the instance should be immutable (using let) or mutable (using var). If declared as immutable, neither the value itself nor its nested values can be reassigned or have mutating methods invoked, even if the nested values are defined as mutable within their container. After all, the whole thing is declared to be constant. If declared as mutable, on the other hand, all of the above operations are permitted. Note that there is no difference between mutating an existing value and assigning an equivalent new value, since value types don’t have an inherent identity as reference types do. With just one implementation of the type, we get both an immutable and a mutable variant, with the mutable parts simply being unavailable when used as an immutable type.
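As a sketch of how the declaration site decides mutability, assume hypothetical `Price` and `Watch` structs whose properties are all declared with `var`:

```swift
// Hypothetical nested value types; all properties are declared mutable.
struct Price: Equatable {
    var amount: Int
    var currency: String

    mutating func applyDiscount(percent: Int) {
        amount -= amount * percent / 100
    }
}

struct Watch: Equatable {
    var name: String
    var price: Price
}

// Declared with let: the entire value, including nested values, is constant.
let fixed = Watch(name: "Speedmaster", price: Price(amount: 5000, currency: "EUR"))
// fixed.price.amount = 1                   // compile error: 'fixed' is a 'let' constant
// fixed.price.applyDiscount(percent: 10)   // compile error for the same reason

// Declared with var: the same type is now fully mutable.
var editable = fixed
editable.price.applyDiscount(percent: 10)
print(editable.price.amount)   // 4500
print(fixed.price.amount)      // 5000

// Mutating in place and assigning an equivalent new value are indistinguishable.
var reassigned = fixed
reassigned = Watch(name: "Speedmaster", price: Price(amount: 4500, currency: "EUR"))
print(editable == reassigned)  // true
```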

Also note that this is by no means exclusive to Swift. In fact, it can be done in plain C with just structs and const. Swift does add a couple of niceties such as support for methods and reference counting for nested reference types though.

 

Value Types and Identities

Identities are crucial to distinguishing things from one another, and classes give us the object identity for free. Let’s imagine we have a list of watches and each watch is being represented by a cell in a UITableView. We can use the list of watches to feed the table view, and when a new list of watches is set, we can tell where cells need to be inserted, deleted, or moved based on the watch identities.

Except object identity isn’t what we want here: When we get an updated set of watches from an API endpoint and want to present those to the user, after deserialization we have objects with all new identities. Our update logic based on object identities would then simply remove all of the old cells and create new cells for all of the watches.

So what kind of identity do we want here? In the example above, the objects after the update still represent the same thing: the same underlying records in our database. In a database we generally have a primary key that defines the identity of the records we are dealing with; something like a watchID for the example above. This key alone defines the record identity of a watch. And we can have multiple snapshots of the same record at different points in time (say before and after an update) and associate them using their record identity.

Of course this is possible with reference types, too. But value types make this really easy because we can ensure that the snapshots are immutable and because they don’t bring this other kind of identity with them, that — in our use case — has no meaning whatsoever.

SwiftUI introduced the Identifiable protocol, which allows types to vend their record identity, along with several UI components that build upon the concept of record identity. Soon after its introduction, the protocol was moved to the standard library for everyone to use.

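// Simplified from the standard library declaration.
protocol Identifiable {
    // The type of the record identity, e.g. the type of a primary key.
    associatedtype ID: Hashable
    // The record identity of this value.
    var id: ID { get }
}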
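A model value can then adopt the protocol by exposing its primary key. The sketch below assumes a hypothetical `Watch` struct whose `id` plays the role of the watchID mentioned above, and it uses the `Identifiable` protocol that ships with the standard library.

```swift
// Hypothetical model value; `id` stands in for the database primary key.
struct Watch: Identifiable, Equatable {
    let id: Int
    var name: String
    var price: Int
}

// Two snapshots of the same record, taken before and after a price update.
let before = Watch(id: 42, name: "Speedmaster", price: 5000)
let after = Watch(id: 42, name: "Speedmaster", price: 4800)

print(before.id == after.id)   // true: same record identity
print(before == after)         // false: the content changed between snapshots
```

Keeping identity (`id`) and content equality (`==`) separate is what later allows a list to tell "same row, new content" apart from "different row".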

 

Observing Changes

Implementing the observer pattern for (nested) mutable classes is very error-prone, and even a correct implementation has a couple of drawbacks. Firstly, mutations are communicated in a very fine-grained fashion: multiple consecutive mutations notify the observer individually, even though we are usually interested in the change as a whole. Secondly, because we generally don’t know which part of the object changed, we need to assume the worst and update everything that depends on any part of the object.

For value types, on the other hand, we don’t need to implement observation of this kind. Recall that for value types, the declaration site defines the mutability of a value. Immutable values simply aren’t allowed to change and mutable values can only change as a whole, i.e. mutations to the value itself or one of its nested values result in a mutation of the entire value, causing a potential didSet property observer to be called. This way we can perform mutations of arbitrary scale or even assign an entirely new value and have the property observer be called just once, plus we have access to the oldValue to find out what part of the data actually changed and perform our resulting updates more efficiently.
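A minimal sketch of this idea, assuming a hypothetical `WatchOwner` class that stores a `Watch` value and records what its `didSet` observer saw:

```swift
// Hypothetical nested value types.
struct Price: Equatable {
    var amount: Int
    var currency: String
}

struct Watch: Equatable {
    var name: String
    var price: Price
}

// Hypothetical owner of the data; it only records what the observer noticed.
final class WatchOwner {
    private(set) var didSetCount = 0
    private(set) var lastChanges: [String] = []

    var watch: Watch {
        didSet {
            didSetCount += 1
            lastChanges = []
            // oldValue lets us find out which parts actually changed.
            if oldValue.name != watch.name { lastChanges.append("name") }
            if oldValue.price != watch.price { lastChanges.append("price") }
        }
    }

    init(watch: Watch) { self.watch = watch }
}

let owner = WatchOwner(
    watch: Watch(name: "Speedmaster", price: Price(amount: 5000, currency: "EUR"))
)

// Mutating a nested value counts as a mutation of the whole value.
owner.watch.price.amount = 4800
print(owner.didSetCount, owner.lastChanges)   // 1 ["price"]

// Several changes can be grouped on a local copy and written back once.
var draft = owner.watch
draft.name = "Speedmaster Professional"
draft.price.amount = 4500
owner.watch = draft
print(owner.didSetCount, owner.lastChanges)   // 2 ["name", "price"]
```

Note that the grouping comes from mutating a local copy; two separate assignments through `owner.watch` would still trigger the observer twice.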

 

Value Semantics

Not everything can be a value type. Sometimes we need the unique identity that an object’s address gives us. Sometimes the size of the data changes dynamically and requires reallocation, as with arrays that have elements inserted or removed. And value types can even behave like reference types, for example if they reference shared mutable storage such as global variables or if they contain mutable objects.
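The last case is easy to run into by accident. The sketch below assumes a hypothetical `Badge` struct that contains a reference to a mutable `Counter` object:

```swift
// Hypothetical mutable class stored inside a struct.
final class Counter {
    var value = 0
}

struct Badge {
    var label: String
    let counter: Counter   // copying the struct copies only this reference
}

let first = Badge(label: "first", counter: Counter())
var second = first

second.label = "second"     // independent: label is a value
second.counter.value += 1   // shared: both structs point to the same Counter

print(first.label)           // first
print(first.counter.value)   // 1: the change leaked through the shared object
```

Although `Badge` is a struct, it does not have value semantics, because part of its state lives in a shared object.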

What really matters is what kind of semantics a type has; what guarantees it makes, how it behaves. In the standard library, even collections have value semantics: an array, a dictionary, or a set is just a pointer to (potentially shared) mutable storage on the heap, but these types implement an efficient copy-on-write mechanism that copies the underlying storage prior to mutation if it is referenced more than once.
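The copy-on-write idea can be sketched with the standard library function `isKnownUniquelyReferenced`. The `IntStack` and `Storage` types below are hypothetical and only illustrate the mechanism; they are not how the standard library collections are actually implemented.

```swift
// Hypothetical heap storage shared between copies of the struct.
final class Storage {
    var elements: [Int]
    init(elements: [Int]) { self.elements = elements }
}

// Hypothetical value type with copy-on-write behavior.
struct IntStack {
    private var storage = Storage(elements: [])

    init() {}

    var elements: [Int] { storage.elements }

    mutating func push(_ value: Int) {
        // Copy the storage only if another value still references it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(elements: storage.elements)
        }
        storage.elements.append(value)
    }

    // Exposed only to make the sharing visible in this example.
    func sharesStorage(with other: IntStack) -> Bool {
        storage === other.storage
    }
}

var first = IntStack()
first.push(1)

var second = first                          // cheap: both share one Storage
print(first.sharesStorage(with: second))    // true

second.push(2)                              // storage is copied before the write
print(first.elements)                       // [1]
print(second.elements)                      // [1, 2]
print(first.sharesStorage(with: second))    // false
```

From the outside, `IntStack` behaves like a plain value even though it is backed by a reference, which is what is meant by value semantics.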

 

Observation and Diffing in SwiftUI

In SwiftUI, the body of a view is invalidated when the value of a Binding or ObservableObject that is read in the body changes. On the next run loop iteration, the invalidated view bodies are re-rendered and the resulting virtual view tree is diffed to find out which changes need to be applied to the view hierarchy. This is implemented as an invalidation instead of a direct reload so that multiple consecutive updates can be grouped together, something the observer pattern as implemented by ObservableObject does not allow for.

The content of dynamic lists is diffed according to the record identity of the Identifiable model values to find out where cells need to be inserted, deleted, or moved.

 

Value Types at Chrono24

While we do not currently use SwiftUI, there are a couple of lessons to be learned from how things work there, namely regarding data flow and how updates are performed.

In the Chrono24 iOS application, we generally want to avoid shared mutable state. We therefore use value types for most of our data. Our data generally has an owner, either a service or a controller, who manages it and passes snapshots of the data to other components that need it, primarily views. The views can therefore not mutate the data directly and must instead send actions back to the controller to request changes. The controller can then handle the action, possibly mutate the data as a result, and write (a snapshot of) the updated data back to the view and all other parts that require the updated data.
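The following sketch shows the general shape of this data flow. All names (`ConversationState`, `ConversationAction`, `ConversationView`, `ConversationController`) are hypothetical, and the view is a plain class instead of a UIKit view to keep the example self-contained; it is not the actual Chrono24 implementation.

```swift
// Hypothetical snapshot type and the actions a view may request.
struct ConversationState: Equatable {
    var title: String
    var isRead: Bool
}

enum ConversationAction {
    case markAsRead
}

// Stand-in for a view: it holds a snapshot and can only send actions.
final class ConversationView {
    private(set) var state: ConversationState
    var onAction: ((ConversationAction) -> Void)?

    init(state: ConversationState) { self.state = state }

    func render(_ newState: ConversationState) { state = newState }
    func userTappedMarkAsRead() { onAction?(.markAsRead) }
}

// The controller owns the data and is the only place where it is mutated.
final class ConversationController {
    private(set) var state: ConversationState
    let view: ConversationView

    init(state: ConversationState) {
        self.state = state
        self.view = ConversationView(state: state)
        view.onAction = { [weak self] action in self?.handle(action) }
    }

    private func handle(_ action: ConversationAction) {
        switch action {
        case .markAsRead:
            state.isRead = true
        }
        view.render(state)   // write the updated snapshot back to the view
    }
}

let controller = ConversationController(
    state: ConversationState(title: "Offer", isRead: false)
)
controller.view.userTappedMarkAsRead()
print(controller.view.state.isRead)   // true

// A copy obtained from the view cannot affect the owner's data.
var localCopy = controller.view.state
localCopy.title = "Changed locally"
print(controller.state.title)         // Offer
```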

Building on top of that, we implement the Identifiable and Equatable protocols in our model values all throughout the app in order to allow for efficient updates of the view hierarchy once the underlying data changes: Just by extracting the primary keys of our models and comparing them to those before the update, we can tell where items need to be inserted, deleted, or moved. For snapshots of records whose identity is present in both the old and the new data, we can then check if there have been changes to the rest of the record using its Equatable conformance. If there are, we pass the new snapshot to the respective view so that it can redraw; if there aren’t, then the two snapshots are in fact substitutable and we don’t need to do any extra work.
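A simplified version of such an update computation might look like the sketch below. The `Watch` and `ListUpdate` types and the `makeUpdate(from:to:)` function are hypothetical; the sketch assumes that IDs are unique within each list and leaves out the detection of moves for brevity.

```swift
// Hypothetical model value with record identity and content equality.
struct Watch: Identifiable, Equatable {
    let id: Int
    var name: String
    var price: Int
}

// Hypothetical description of what a list needs to do.
struct ListUpdate: Equatable {
    var inserted: [Int] = []
    var deleted: [Int] = []
    var changed: [Int] = []
}

// Assumes unique IDs per list; moves are not detected in this sketch.
func makeUpdate(from old: [Watch], to new: [Watch]) -> ListUpdate {
    let oldByID = Dictionary(uniqueKeysWithValues: old.map { ($0.id, $0) })
    let newIDs = Set(new.map(\.id))
    var update = ListUpdate()

    for watch in new {
        if let previous = oldByID[watch.id] {
            // Same record identity: redraw only if the content differs.
            if previous != watch { update.changed.append(watch.id) }
        } else {
            update.inserted.append(watch.id)
        }
    }
    update.deleted = old.map(\.id).filter { !newIDs.contains($0) }
    return update
}

let oldWatches = [
    Watch(id: 1, name: "Speedmaster", price: 5000),
    Watch(id: 2, name: "Submariner", price: 9000),
    Watch(id: 3, name: "Tank", price: 3000),
]
let newWatches = [
    Watch(id: 1, name: "Speedmaster", price: 4800),   // price changed
    Watch(id: 3, name: "Tank", price: 3000),          // unchanged
    Watch(id: 4, name: "Nautilus", price: 90000),     // new record
]

let update = makeUpdate(from: oldWatches, to: newWatches)
print(update.inserted)   // [4]
print(update.deleted)    // [2]
print(update.changed)    // [1]
```

The watch with ID 3 appears in none of the three lists, so its cell does not need to be touched at all.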

A sample project on GitHub shows the problems with reference types and how we use value types at Chrono24 to solve common tasks when developing an iOS app.

In conclusion, these are our biggest takeaways:

  • By using value types for our model types we can reduce shared mutable state and observation bookkeeping logic.
  • When using value types for your models, think of them as snapshots of a record at different points in time and define their record identity in terms of the primary key in the underlying database.
  • Value types allow us to perform UI updates more efficiently: we can trivially compose multiple updates together and perform only a single efficient redraw because we know both the old and the new value and can find out which parts do (or don’t) need to be redrawn.
