Induced Pluripotency as a Benchmark for Differentiable Cell Simulators

Induced pluripotency is one of the more exciting recent discoveries in cell biology. In essence, adult cells can be exposed to a set of transcription factors (“reprogramming factors”) that induce them to revert to a stem-cell-like state. The cascades involved are complex. The discovery of induced pluripotent stem cells (iPSCs) was recognized with a Nobel Prize in 2012.

I’ve suggested in a previous post that differentiable cell simulators would prove to be a powerful new primitive for computational biology. (Here “differentiable” refers to the mathematical operation, not to biological cell differentiation.) In my mental model, a powerful differentiable simulator would provide a high-fidelity computational version of a cell. This model should have a number of desirable properties. Most importantly, it should respond to “experimental measurements” in the same way the real cell would. For example, it might be possible to use a deep generative model to create “microscopy images” of the simulated cell, which would allow experimenters to probe its internals at will. The model would of course need to be trained on real microscopy images of cells in order to learn the needed structures, but the advent of large cellular datasets makes this task seem feasible.

What does this have to do with induced pluripotency, though? Well, if the differentiable cell really is a high-fidelity model of a real cell, it ought to be possible to induce pluripotency in the simulated cell just as it is in the real one. Several considerations complicate this basic idea. There’s of course a dumb way to induce pluripotency in a simulated cell: have a switch statement that changes the “cell type” upon probing. This doesn’t teach us much. For the induced pluripotency to be meaningful, the differentiable simulator would need a realistic cellular environment. Not all transcription factors can reprogram a cell, so the simulator should respond in a meaningful way to the specific factor that is actually introduced. Ideally, the mechanism of induced pluripotency would be learned rather than hardcoded, but this might prove considerably challenging.
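To make the contrast concrete, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the one-gene toy network, and all parameter values are illustrative assumptions, not a model of real reprogramming biology. The first function is the “dumb” switch; the second computes the cell state from simple dynamics, so the outcome depends on which factor is introduced and at what dose.

```python
import math


def hardcoded_cell(cell_type, probe):
    """The uninformative approach: flip a label when a magic input arrives."""
    if probe == "reprogramming_factors":
        return "pluripotent"
    return cell_type


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_toy_cell(factor_dose, factor_potency, steps=2000, dt=0.01):
    """Toy dynamics (an assumption for illustration only).

    A single 'pluripotency' gene p activates its own expression and is
    also driven by an externally introduced factor. Returns the final
    expression level of p, a number between 0 and 1.
    """
    p = 0.0  # start differentiated: pluripotency gene switched off
    drive = factor_potency * factor_dose
    for _ in range(steps):
        # Production (self-activation plus external drive) minus linear decay,
        # integrated with a simple forward Euler step.
        dp = sigmoid(10.0 * p + drive - 5.0) - p
        p += dt * dp
    return p


potent = simulate_toy_cell(factor_dose=1.0, factor_potency=5.0)
weak = simulate_toy_cell(factor_dose=1.0, factor_potency=0.5)
print(f"potent factor: p = {potent:.3f}")
print(f"weak factor:   p = {weak:.3f}")
```

With these made-up parameters, the potent factor pushes the toy cell into the high-expression state while the weak one leaves it differentiated. The dynamics here are still written by hand; the point is only that the outcome is computed from the input rather than looked up. A learned simulator would have to discover such responses from data.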

This idea of inducing pluripotency in a simulation isn’t a “benchmark” in the usual machine learning sense. We’re not measuring accuracy on a held-out test set. Rather, we’re checking that we can reproduce an experimentally known phenomenon in a reasonable fashion within the learned model. This is a broad test of generalizability: passing it would be strong evidence that the differentiable cell has learned powerful internal representations that might serve as a useful resource for further scientific investigation.
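One way to picture such a check is as a behavioral test suite rather than an accuracy score. The Python sketch below is a hypothetical harness, not an established benchmark: the simulator interface (a function from a set of factor names to a pluripotency score), the threshold, and the placeholder simulator are all assumptions made for illustration. The positive case uses the four classic reprogramming factors (OCT4, SOX2, KLF4, MYC); the negative cases are controls.

```python
def run_phenomenon_check(simulator, cases, threshold=0.5):
    """Check whether a simulator reproduces known qualitative outcomes.

    simulator: callable taking a frozenset of factor names and returning
               a pluripotency score between 0 and 1 (assumed interface).
    cases:     list of (name, factors, expect_pluripotent) tuples.
    Returns a dict mapping case name to True if the outcome matched.
    """
    results = {}
    for name, factors, expect_pluripotent in cases:
        score = simulator(frozenset(factors))
        results[name] = (score >= threshold) == expect_pluripotent
    return results


# Known qualitative phenomena the simulator is expected to reproduce.
CASES = [
    ("four reprogramming factors", {"OCT4", "SOX2", "KLF4", "MYC"}, True),
    ("no factors", set(), False),
    ("GFP only (control)", {"GFP"}, False),
]


def placeholder_simulator(factors):
    """Stand-in so the harness runs; NOT a real or learned simulator."""
    required = {"OCT4", "SOX2", "KLF4", "MYC"}
    return len(required & factors) / len(required)


results = run_phenomenon_check(placeholder_simulator, CASES)
for case_name, passed in results.items():
    print(f"{case_name}: {'pass' if passed else 'FAIL'}")
```

The placeholder is exactly the kind of hardcoded lookup that would make the test uninformative, so it is there only to exercise the harness. The check becomes meaningful when the simulator passed in was trained without being told the answer.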

Acknowledgements: Thanks to Sandya Subramanian for useful discussions.