Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Seems a bit of conflation of notions going on here: Just because logic/quantum/probabilistic all end with "programming", it doesn't mean that they all are what is called a "programming paradigm" in the typical sense [0]. Where does a paradigm end and a library implementation begin?

For example, varying how a program arrives at the answer induces impertive/OOP vs. declarative paradigms [0]. On the other hand, quantum programming assumes a radically different type of "CPU" (i.e. instruction set) on which your program runs - which in turn obviously changes everything. But this can safely by implemented within existing paradigms, e.g. ProjectQ [1] is implemented in an OOP language: Python. Thus, I would not call this a new programming paradigm (unfortunately [0] does that, but I think that is bad form).

> you can view differentiable programming as extending logic programming [...] that allows interpolation between data and code.

Reference? I have browsed one of the standard references [2] on automatic differentiation and googled a bit and could not find something that supports your statement. Even more so, is seems that even defining semantics for differential programming is barely in its starting stages [3].

> differentiable programming is at the level of syntax sugar for reverse mode differentiation, so I can't blame you for that conclusion.

By the same argument you could say that probabilistic programming is just syntax sugar for painless specification of statistical models.

Care to provide a reference where differential programming is presented as something more than "syntax sugar"?

[0] https://en.wikipedia.org/wiki/Programming_paradigm#Further_p...

[1] https://projectq.ch/

[2] Griewank A., Walther A., Evaluating Derivatives, SIAM 2008

[3] https://arxiv.org/pdf/1911.04523.pdf



The existence of a different kind of CPU isn't a meaningful distinction at the level of discussing paradigms. The semantics are different, so the abstract machine is different. The fact that I need a different set of atoms in my desktop to use it doesn't change the programming language part of the discussion.

The main paper to read is [1] which introduces a syntactic notion of differentiation in the lambda calculus connecting substitution and nondeterministic choice to differentiation in the calculus of infinitesimals sense and also introduces a meaningful notion of Taylor expansion of arbitrary programs. This paper is mostly of academic interest, though. The resulting expansion is wildly uncomputable meaning that more modern, practical papers like [2] cite it wistfully as a dream of what could be achieved. How to computably handle most of the constructs we care about in a general programming sense is very active, open research. At the time the paper was introduced, it was more influential on (and influenced by) work on probabilistic and quantum programming through their related models of linear logic [3]. There are only a few slight axiom differences that separate differential, logic, probabilistic, and quantum programming though, so if you're willing to accept one as a "paradigm", then you should accept the others.

[1] https://www.sciencedirect.com/science/article/pii/S030439750... [2] https://arxiv.org/abs/1911.04523 [3] https://ncatlab.org/nlab/show/differential%20category


It seems you haven't read my references? As your [2] is my [3] from above!

> The existence of a different kind of CPU isn't a meaningful distinction at the level of discussing paradigms.

Well, that was my point above: You can't really lump quantum programming together with probabilistic programming, as they are paradigms on different "levels".

> practical papers like [2] cite it wistfully as a dream of what could be achieved

Are you sure about that? I skimmed [1] as I wasn't hadn't read it and it seems to describe a rather restricted set of functions ("types are interpreted as vector spaces and terms as functions defined by power series on these spaces"), as there are many differentiable functions that cannot be defined as power series.

Moreover, in [2] it is only claimed: "Ehrhard and Regnier {i.e. your reference [1]} do not give an operational semantics but they do give rules for symbolic differentiation and it should not be too difficult to use them to give an operational semantics. However their language with its convenient vector space semantics only supports total functions. It therefore cannot be extended to include recursive function definitions or conditionals (even with total predicates, as continuous functions from R^n to the booleans are constant)." So I would not say they cite [1] as a wistful dream ...

> There are only a few slight axiom differences that separate differential, logic, probabilistic, and quantum programming though.

Give me the axioms and their differences and I believe you. :) (Honestly, I'm not even sure if the discussion on which axiomatization captures the existing developments has been settled; it seems to me you have some kind of category theoretic approach in mind where you just change the category and get a new paradigm - I'd be happy to accept this as well, if there is a clear reference, though I'm doubtful one exists ...)


I don't usually respond to old comments, so I don't know if you'll read this, but I hope I can encourage you to think more broadly about what "differentiable programming" means.

Different fields have a different perspective on the same set of tools because those tools have different pathological cases in different areas. Context really matters.

> Well, that was my point above: You can't really lump quantum programming together with probabilistic programming, as they are paradigms on different "levels"

This distinction is not useful. If I write in a functional or logic programming language, it gets translated into imperative commands for an underlying architecture that is some mix of dataflow, event driven, automata-based, concurrent, etc... that is then further built on top of some physical atoms where an engineer worried about quantum effects. If I write in a quantum programming language, it will probably go through the same process for at least another 5 years. You might argue that quantum is somehow more dictated by the underlying physical model the way that people argue that imperative programming is closer to the physical world than functional programming. But the "level" doesn't change the usefulness of viewing all of these as "paradigms" worthy of study and analysis on their own terms with their own tools. At the level of studying a programming language, the "level" is a useful thing to be aware of for implementations and motivation but usually not for a theory.

> Even more so, is seems that even defining semantics for differential programming is barely in its starting stages

This is also not a useful distinction. OOP was also, infamously, a point of contention between the academic and outside worlds because it was developed and incredibly prevalent without a rigorous theory abstracting it beyond procedural programming. It became a "paradigm" despite that because there was a set of (informal) tools for reasoning about it on its own terms [1].

Likewise, differentiable programming has largely developed to formalize what makes programs written for machine learning frameworks different from programs written in the imperative/object oriented/functional language they are built on. Autodiff has mostly developed in practical usage, so the use cases are front-running the theory. There's increasingly hardware tailored to the execution model and software developers attempting to program it. There are approaches to problems like discontinuities that people have found solutions for without a rigorous theory justifying their use. There's a structure to why and how people are writing code for these applications as well as an operational theory for how to reason about it, but there's very little compositional, equational theory for these choices.

To most people in machine learning, "differentiable programming" is just autodiff with pretty syntax because the term sprung from attempting to put what they are already trying to accomplish with that implementation on more solid theoretical footing as a computable model of a more general logic. That, hopefully, lets us more efficiently explore what a better domain-specific theory might be and if there are better execution models or logical frameworks. Autodiff itself is increasingly used as an umbrella term for many other methods of differentiation with different edge cases, so this is already happening in practice.

To reduce "differentiable programming" to just its implementation ignores that aspect. It would be equivalent to equating machine learning to matrices. Not unreasonable computationally and not a terrible place to start for a theory, but deeply unsatisfying as a mature domain-specific theory.

The main paper I linked [2] is not about autodiff at all. It's an attempt to establish a connection between differentiation in an analysis sense to models of (not otherwise obviously differentiable) program evaluation. The (unrealized) promise is that the centuries of understanding we have for the calculus of infinitesimals can be applied to the less-mature study of lambda calculus and nondeterministic computation. Papers like [3] cite it because it addresses (discrete) structures that analysis is less interested in and potentially provides a way to connect computation, calculus, and whatever it is that we're doing with machine learning.

> Are you sure about that? I skimmed [1] as I wasn't hadn't read it and it seems to describe a rather restricted set of functions ("types are interpreted as vector spaces and terms as functions defined by power series on these spaces"), as there are many differentiable functions that cannot be defined as power series.

Because it's a PL theory paper, it's not concerned with whether all differentiable functions can be represented, but whether all computable functions can be differentiated. And PL theorists are generally more comfortable than most to accept that most functions cannot be computed and choose a more restrictive model that enables more reasoning power. The category [4] is really the better place to start since it lets us also consider models that aren't vector spaces and [2] is best thought of as a prototype that left many gaps in the theory (for example the requirement on coefficients for convergence is wrong, though I can't remember which paper by Lionel Vaux proved this). It can be thought of as computable in finite instances, but it's unsound even when typed due to the zero term and result of sums.

The quote you cite from [3] is easily misunderstood without that context. As a practice-focused paper, it cares very much about computability. Conditionals and loops are possible in [2] since it allows church numerals and fixed point combinators but it introduces a nondeterministic sum which is exponential in the number of evaluation steps and may diverge (doubly so since it's the untyped lambda calculus...) and is difficult to operationalize. That's what I meant by "wildly uncomputable". So, to them, [2] offers a useful mental framework for higher order features, but is not practical. The theory isn't there yet.

The connections between logic, quantum, probabilistic, and differentiable programming can be understood by how the model treats the exponential modality (!) which converts the otherwise linear term to an analytic one. Differentiation decomposes this to give a sum of linear terms. Differential lambda calculus doesn't put any (more) structure on the sum. Probabilistic programming gives the added structure of a probabilistic sum where coefficients are weights. Quantum programming can be modeled via a Fock space [5] for (!) ([5] predates [2] so is not directly discussed as a differential category here). However, it's unclear what the right model for differentiable programming should be if we want something practical for the resulting derivative (much less antiderivative). Daniel Murfet et al [6] have some related work more directly in the context of machine learning.

[1] https://www.cs.cmu.edu/~aldrich/papers/objects-essay.pdf [2] https://www.sciencedirect.com/science/article/pii/S030439750... [3] https://arxiv.org/abs/1911.04523 [4] https://ncatlab.org/nlab/show/differential%20category [5] https://www.researchgate.net/publication/2351750_Fock_Space_... (sadly a researchgate link) [6] http://therisingsea.org


> don't know if you'll read this

I did read it ;) Because I'm very much interested in this entire topic.

> I hope I can encourage you to think more broadly about what "differentiable programming" means

I'm trying to, but I find it hard. My stance was that differentiable programming seemes like this theory for which only a single example (namely autodiff) existed, as you also said ("autodiff has mostly developed in practical usage, so the use cases are front-running the theory"). But this entire comment of yours really clarified some things for me.

> The main paper I linked [2] is not about autodiff at all. >The quote you cite from [3] is easily misunderstood without that context

You finally convinced me to have a detailed look at this. Thank you for providing the context.

> Conditionals and loops are possible in [2] since it allows church numerals and fixed point combinators but it introduces a nondeterministic sum [...] and is difficult to operationalize. That's what I meant by "wildly uncomputable".

I think I may have misunderstood some of your previous comments (and perhaps vice versa) as it now dawns on me that you use a vocabulary than comes (I guess?) from PL theory and is very different from the one I'm used to, as a mathematician versed in analysis. I'll re-read them.

> Daniel Murfet et al [6] have some related work more directly in the context of machine learning.

I'm actually aware of Daniel Murfet but haven't read his work from last years. Did you have a specific paper from him in mind?


P.S. (In case you return once more to read comments.)

It seems like these are problems that could benefit from a PL-theory <-> mathematical analysis cross-collaboration.

I would have sent you a private message via email, but couldn't find any info on your HN profile; my email is there.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: