Syncing for TimeTag and Capparsa's apps
Syncing for TimeTag has been a long-requested feature--and something I have wanted myself for the past four years.
I've written before about why syncing is such a challenge...and it remains one today. Plenty of other app developers have written about how difficult it was to get their apps syncing across devices. Unfortunately, while the complexity hasn't gone down, the demand for apps that sync has gone up.
People are on the go, and own way more devices than they did in the past. They want everything to be "as they left it", and without having to put too much thought into it. I agree with this--I'm upset when I have an app that I love that works in a silo from other copies of it.
So I finally sat down and got to work: I was determined to come up with a good syncing formula for TimeTag for the iPhone, iPad and Mac.
The database isn’t too complicated--in fact it’s pretty simple. It has time records, tags and categories. It always seemed like it should be easy enough to sync, but there were always tons of issues that popped up. Issues like:
Which record would win in the event of a conflict?
What happens if the user’s network is down?
How do you associate all the right records with tags, and the right tags with categories? (What happens if a tag is deleted, or a category changes color?)
How do you deal with deleted objects without letting zombies come back? (Items that regenerate endlessly after being deleted.)
Things like that were always an issue. The very first attempts used a kind of looping algorithm that Compositions uses today. It goes something like this:
Pull down everything from the Internet
Get everything from the local database
Loop through all the online documents, and check for a matching local one. Figure out what’s updated, or if the local document is missing, and if it is missing, figure out why (like maybe it was deleted?)
Loop through all the offline documents, and check for matching online ones. If there is no matching online one, figure out if we need to upload it (or maybe it should be deleted?)
Loop through any conflicted documents that arose as a result of those two loops and do some final cleanup
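To make the shape of that looping approach concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the code Compositions ships). It assumes each side is simply a mapping from a document ID to a last-edited timestamp, and the name two_pass_sync is hypothetical. Notice that a missing document is ambiguous in both loops, which is where the trouble starts.

```python
def two_pass_sync(online, local):
    """Hypothetical sketch: online and local map a document ID to an edit timestamp."""
    actions = []
    conflicts = []

    # Pass 1: walk the online documents and look for a matching local one.
    for doc_id, online_edited in online.items():
        if doc_id not in local:
            # Ambiguous: never downloaded here, or deleted locally?
            conflicts.append(doc_id)
        elif online_edited > local[doc_id]:
            actions.append(("download", doc_id))

    # Pass 2: walk the local documents and look for a matching online one.
    for doc_id, local_edited in local.items():
        if doc_id not in online:
            # Ambiguous again: never uploaded, or deleted on another device?
            conflicts.append(doc_id)
        elif local_edited > online[doc_id]:
            actions.append(("upload", doc_id))

    # A third cleanup pass over `conflicts` would have to guess at intent.
    return actions, conflicts
```

Every entry that lands in the conflicts list needs extra bookkeeping to resolve, and each guess is a chance to lose or resurrect a document.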
As you can see, it’s a real mess. It works, but wow, it was buggy and prone to error. Today it’s pretty stable, but getting there took months of work and a few upset customers letting me know a document was lost.
So that was the approach I first tried with TimeTag and, big surprise, it proved really prone to error. I was getting infinitely duplicating time records, tags that wouldn’t update but would instead create new copies of themselves, and zombie records (after I deleted them, they’d reappear everywhere because of how deleting and syncing interact).
I needed a new way to do it. I still liked the idea of comparing local to online, and taking action on that, but I needed a much cleaner way to do it. Then the idea hit me: Why not create a wrapper object?
Since every record has what I call a “universal ID” that is supposed to be the same across every single device (and the cloud), I could use this as the way to match up records of any type--whether it’s a time record, tag or category.
I created a wrapper class, and did the following:
Go online and pull down all the online objects of type X, and stick them in a newly created wrapper of X
Pull the local copies of objects of type X, and attach each one to the matching wrapper of X if it exists already (or create a new wrapper if it doesn’t)
Iterate through wrappers of X
When I iterate through wrappers of X, one of three things can be true:
1) Both the online and local X objects are there
2) Just the online version is there
3) Just the local version is there
If 1) is true, then I need to check edited dates and sync status markers, and update both as appropriate.
If 2) is true, then I need to update or create the local version
If 3) is true, then I need to create the online version
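Here is a minimal sketch of the wrapper idea in Python (used only for illustration; it is not TimeTag's actual code). The names Record, SyncWrapper, build_wrappers and plan_action are hypothetical, and the record fields are assumptions beyond the universal ID and edited date described above. Records from both sides are paired by universal ID, and each wrapper then falls into exactly one of the three cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    universal_id: str   # same value on every device and in the cloud
    edited_at: float    # assumed: last-edited timestamp
    payload: str        # assumed: stand-in for the real fields


@dataclass
class SyncWrapper:
    universal_id: str
    online: Optional[Record] = None
    local: Optional[Record] = None


def build_wrappers(online_records, local_records):
    """Pair online and local records that share a universal ID."""
    wrappers = {}
    # Step 1: every online object goes into a newly created wrapper.
    for rec in online_records:
        wrappers[rec.universal_id] = SyncWrapper(rec.universal_id, online=rec)
    # Step 2: attach each local object to its wrapper, creating one if needed.
    for rec in local_records:
        wrapper = wrappers.setdefault(rec.universal_id, SyncWrapper(rec.universal_id))
        wrapper.local = rec
    return list(wrappers.values())


def plan_action(wrapper):
    """Map a wrapper onto one of the three cases."""
    if wrapper.online is not None and wrapper.local is not None:
        return "compare"                 # case 1: check edited dates and markers
    if wrapper.online is not None:
        return "update_or_create_local"  # case 2: only the online version exists
    return "create_online"               # case 3: only the local version exists
```

Because the wrapper only cares about the universal ID, the same two functions work for categories, tags and time records alike, which is what makes the approach easy to reuse.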
This proved to be a very reproducible algorithm. Once I got it working with categories, I was able to get it working with tags and then with records. Even better, as I began to reproduce the code, I was able to make it more generic and refactor a lot more of it.
What I have now is a really cool syncing engine that uses Parse as a backend. It updates properly, logs in a new user and pulls everything down just fine, and keeps devices in sync with one another. It uses a “last in wins” model, so whoever edited last wins out. That’s not always ideal, but it keeps the syncing model clean, and as long as users are aware of it, it should cover almost all scenarios. As time goes on, I’ll definitely work on making it more robust at dealing with conflicts.
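The “last in wins” rule can be sketched in a few lines of Python (again for illustration only, not the shipping code). Version and resolve_last_in_wins are hypothetical names, the sketch compares edited dates only and leaves out the sync status markers, and treating equal timestamps as "nothing to do" is my assumption.

```python
from typing import NamedTuple


class Version(NamedTuple):
    edited_at: float  # assumed: last-edited timestamp
    payload: str      # assumed: stand-in for the real fields


def resolve_last_in_wins(online, local):
    """Return the winning version and the direction data should flow."""
    if local.edited_at > online.edited_at:
        return local, "push"   # local edit is newer: upload it
    if online.edited_at > local.edited_at:
        return online, "pull"  # online edit is newer: overwrite local
    return online, "none"      # same timestamp: treat as already in sync
```

The trade-off is visible right in the code: whichever side has the older timestamp is overwritten whole, with no attempt to merge individual fields.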
What to do about deleted objects? Well, what I decided was that the best way to know something was deleted was to not actually delete it, but instead mark it as deleted. Then every now and then I can clean up old deleted records so that they don’t sit around taking up space for no reason. The benefit is that you can now check whether a record is deleted (because it says: Deleted) and remove it from devices. This makes it totally unambiguous that the user intended to delete a record, so you won’t ever incorrectly remove something that shouldn’t have been removed...and you won’t get re-created objects!
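A rough Python sketch of the mark-as-deleted idea follows (illustrative only). The dictionary keys, the function names apply_remote and purge_tombstones, and the 30-day retention window are all assumptions; the post does not say how often the cleanup runs or how old a record must be.

```python
from datetime import datetime, timedelta


def apply_remote(local_store, remote_records):
    """Apply records pulled from the server to a local store keyed by universal ID."""
    for rec in remote_records:
        if rec["deleted"]:
            # The marker is explicit, so removing the local copy is unambiguous.
            local_store.pop(rec["universal_id"], None)
        else:
            local_store[rec["universal_id"]] = rec


def purge_tombstones(records, now, max_age=timedelta(days=30)):
    """Drop records that were marked deleted more than max_age ago (assumed window)."""
    return [
        rec for rec in records
        if not (rec["deleted"] and now - rec["edited_at"] > max_age)
    ]
```

The retention window matters: if a marked-as-deleted record is purged before every device has synced, a device that still holds the record would look like it has a brand new one, and the zombie would be back.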
That’s all the detail I have for now. I need to get back to testing and implementing. I hope to have TimeTag sync shipping this month--first for iPhone and iPad, and then for Macs. Stay tuned for more updates!