Last time I presented the idea that version control and delimited continuations are related. I left off with a question how how to make Darcs fit this model. I think I understand now what I was missing.
I forgot to think about Darcs operations in terms of the intermediate operations that get performed. In Darcs, everything is based on commuting patches, even merging. Therefore, to see how Darcs fits into this model it's important to think about commuting patches in terms of delimited continuations.
Specifically, I now believe that commuting two patches introduces marks that can be shifted to later.
I have several ideas for the next steps of this. One is to start modeling toy versions of svn and darcs in Haskell via delimited continuations. After that, I would like to figure out the correspondence between the delimited continuations that I've identified and their data structure reification as zippers. I hope to have more about that later.
Judging by a paper written by Oleg, there should be a natural way to convert the delimited continuation representation into a zipper. Investigating this model might shed light on the Darcs patch model, or even lead to a more concise formalism.
06 June 2010
Delimited Continuations and Version Control
Delimited continuations give us a way to create markers that we can jump back to. We can construct the future of the computation, work with the computation so far, or abort the current continuation and go down a new path (create a new future).
These primitives have a natural correspondence with version control systems that snapshot the world like
SVN. Focusing on SVN for a moment:
These primitives have a natural correspondence with version control systems that snapshot the world like
SVN. Focusing on SVN for a moment:
- The current continuation is the transformation of repository state that you're working on. It's the diff you're creating.
- For commits, each revision created by commit is a marker, so we model this with reset.
- Checking out an older revision, or reverting changes, corresponds to a shift. We discard the current continuation and move back to a marker created by a specific reset.
- Updating consists of having the client copy learn the current state of the continuation on the server and applying it to the local copy.
- Starting a branch corresponds to a reset.
- Merging two branches is a bit trickier I suspect. I haven't worked out all the details sufficiently to convince myself I have it right, but here is how I think this case works. The merge first shifts to each of the markers and then combines those two continuations into one future. The part that seems weird to me about this, is that I haven't really seen examples of delimited continuations were the continuation of two different markers (prompts) were combined.
Now, if you accept the above it gives us some intuition to build on. Although my correspondence is terribly informal at the moment, if we took some time to make it formal by working out enough details to model it in, say, Haskell, then we'd have a nice formal backing for how SVN works. I think the above model applies equally well to git, but I'm not confident with git's model.
One insight the above gives, is that the way merging works is not described by the continuations in general. It's up to the exact combining function to determine the merge. We know that an automatic merge can fail in practice due to things like conflicts between the changes in the branches. So, in the SVN implementation considerable work has gone into implementing logic for creating the proper continuation.
Now, in general when a merge, or update, is performed human intervention may be required. This typically happens when the changes are in conflict. What this means for our model is that in general the creation of the continuation requires knowledge outside of our model. What does that correspond to? Well, it's essentially saying that calculating the correct continuation to resolve the merge is non-deterministic!
In other words, this continuation view of version control gives us a rigorous way to talk about our intuition. Of course we can easily tell that merging is going to require human intervention without needing to study delimited continuations, but this framework of reasoning now gives us a more mathematical way to say it.
The next question is: How does the delimited continuation model of vcs apply to Darcs?
So far I'm not sure. I suspect, without working through the details, that in Darcs, every time you record a patch multiple resets happen, instead of just one. The model really breaks down for Darcs because you can seemingly visit points in "history" that did not exist when the patches were created, but they are valid repository states.
For example, imagine a repository that only exists locally. You create a sequence of patches. Now, take a patch in the middle that can be commuted to the end of the patch sequence. Doing so has not created a new repository state; so far this is fine with the above model as no new markers need to be created. Suppose we remove the patch from the end of the patch sequence. This is exactly how darcs unpull works. The funny thing is, we've now created a state that never existed previously. So in the delimited continuation model, what marker did we just shift to?
I don't know the answer yet, but I think it's an interesting question. I suspect there are multiple "correct" answers, but that only some answers will yield elegant and robust models here.
05 June 2010
Reviving my technical blog
My blog has been missing for quite some time. I think about a year to be honest. It's time to start it back up. Hopefully I'll have something interesting to say soon :)
I've imported all my old blog posts (typos and all!). I had to do the imports by hand from google reader. Not exactly fun, but much easier than the alternatives. I've tagged each of them with 'imported' so that's why that tag is my most used tag at the moment :)
I've imported all my old blog posts (typos and all!). I had to do the imports by hand from google reader. Not exactly fun, but much easier than the alternatives. I've tagged each of them with 'imported' so that's why that tag is my most used tag at the moment :)
Subscribe to:
Posts (Atom)