I still don’t understand what "at constant something" means. I mean formally, mathematically, in a way where I don’t have to kinda guess what the result may be and rely on my poor intuitions and shoot myself continually in the foot in the process.
I assume you're talking about thermodynamics - this comes down to a slight abuse of notation. For an ideal gas, say, you can express various state functions like the internal energy in various different ways. You can do it in terms of pressure P and volume V to get U ~ PV, for instance.
Or you could do it in terms of temperature T and pressure, for instance, to obtain U ~ T (in this case there's no dependence on pressure).
The ideal gas laws let you transform between these choices. But the point is that the same physical quantity, U, has multiple mathematical functions underlying it - depending on which pair you choose to describe it with!
To disambiguate this physicists write stuff like (dU/dP)_T, which means "partial derivative of U wrt P, where we use the expression for U in terms of P and T". Note that this is not the same as (dU/dP)_V, despite the fact that it superficially looks like the same derivative! The former is 0 and the latter is ~V, which you can compute from the expressions I gave above.
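If it helps, here is a quick numerical check of those two derivatives. It is only a sketch under assumptions that are not in the comment above: a monatomic ideal gas (so U = (3/2) n R T = (3/2) P V), n = 1 mol, an arbitrary sample state, and made-up function names.

```python
R = 8.314    # gas constant in J/(mol K)
N_MOL = 1.0  # assumed amount of gas

def u_of_p_t(p, t):
    """U written in terms of (P, T); the pressure argument is unused."""
    return 1.5 * N_MOL * R * t

def u_of_p_v(p, v):
    """U written in terms of (P, V)."""
    return 1.5 * p * v

def d_dfirst(f, a, b, h):
    """Central difference in the first argument, second argument held fixed."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

# One physical state, described two ways (assumed sample values).
p, v = 101325.0, 0.024
t = p * v / (N_MOL * R)  # ideal gas law gives the matching temperature

dU_dP_T = d_dfirst(u_of_p_t, p, t, 1.0)  # (dU/dP)_T
dU_dP_V = d_dfirst(u_of_p_v, p, v, 1.0)  # (dU/dP)_V

print("(dU/dP)_T =", dU_dP_T)
print("(dU/dP)_V =", dU_dP_V, "compare 1.5 * V =", 1.5 * v)
```

Both functions return the same energy at this state; they only differ in which second variable stays fixed while P is nudged.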
The mistake is thinking that U is a single function of many independent variables P, T, S, V, etc. Actually these variables all depend on each other! So there are many possible functions corresponding to U in a formal sense, which is something people gloss over because U is a single physical quantity and it's convenient to use a single letter to denote it.
Maybe it would make more sense to use notation like U(T, P) and U(P, V) to make it clear that these are different functions, if you wanted to be super explicit.
> The mistake is thinking that U is a single function of many independent variables P, T, S, V, etc. Actually these variables all depend on each other!
So, in vector space terms, we have different bases for describing U in, but not that many independent variables.
If U is a function of x and y, but x and y are not orthogonal, then I can't treat dU/dx and dU/dy as independent, even for partial derivatives, because x and y aren't really independent.
You're not, in general, just working in a vector space but on a manifold whose coordinates are your extensive variables. It's only linear locally, in the (co-)tangent space where you're doing calculus.
Yeah, I think this is along the right lines - in the vector space analogy it's like we have a bunch of vectors we can measure (P, T, S, V, etc.) but due to the constraints we're actually working in a two-dimensional space. So we could form a basis from many different choices of vectors, and our coefficients would change accordingly.
As the other commenter said, you can make this analogy rigorous by looking at manifolds (differential geometry). They're a little bit like the non-linear version of a vector space. In this case the set of physically valid values for P, T, S, V forms a two-dimensional surface due to the ideal gas laws, and you can derive local coordinate charts for the surface using any (non-degenerate) pair of these variables.
Imagine a function z=f(x,y) in 3D space. Now picture the plane x=3, which is parallel to the plane passing through the Y and Z axes. This x=3 plane cuts through our function, and its intersection with the surface z=f(x,y) forms a sort of 2D function z=g(y)=f(3,y).
(The Wikipedia page[1] has nice images of this [2])
The slope of this new 2D function on the x=3 plane at some point y is then the partial derivative ∂z/∂y for constant x at the point (3,y). We are "fixing" the value of x to a constant by only considering the intersection of our original function with a plane at x=x_0.
That’s just the standard partial derivative in multivariable calculus. This one I have no trouble understanding. My question is about "at constant something" as used in thermodynamics, where "at constant something" is clearly doing more work than just "partial derivative". What work? How? Damned if I know.
Consider f(x,y,z), let’s say f(x, y, z) = x^2 + 3y^3 - e^(-z). What’s the difference between "the partial derivative of f with respect to x" and "the partial derivative of f with respect to x at constant y"? The first one is already at constant y!
In standard multivariate calculus, the partial derivative of f with respect to x, as you explained, is always "at constant y and z".
In thermodynamics, you can say things like "partial derivative of pressure with respect to volume" and add "at constant temperature" or "at constant entropy" and get different results. What? Why? How?
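To make the discrepancy concrete, here is a rough numerical sketch. It assumes a monatomic ideal gas (gamma = 5/3), takes the constant-temperature curve to be P V = const and the constant-entropy (reversible adiabatic) curve to be P V^gamma = const, and uses an arbitrary sample state; the function names are made up for illustration.

```python
GAMMA = 5.0 / 3.0  # assumed: monatomic ideal gas

def p_on_isotherm(v, p0, v0):
    """Pressure along the constant-T curve through (p0, v0): P V = const."""
    return p0 * v0 / v

def p_on_adiabat(v, p0, v0):
    """Pressure along the constant-S curve through (p0, v0): P V^gamma = const."""
    return p0 * (v0 / v) ** GAMMA

def slope(curve, p0, v0, h=1e-7):
    """Central-difference dP/dV along the given curve at the state (p0, v0)."""
    return (curve(v0 + h, p0, v0) - curve(v0 - h, p0, v0)) / (2 * h)

p0, v0 = 101325.0, 0.024  # assumed sample state (Pa, m^3)

dP_dV_T = slope(p_on_isotherm, p0, v0)  # "at constant temperature"
dP_dV_S = slope(p_on_adiabat, p0, v0)   # "at constant entropy"

print("(dP/dV)_T =", dP_dV_T, "compare -P/V =", -p0 / v0)
print("(dP/dV)_S =", dP_dV_S, "compare -gamma*P/V =", -GAMMA * p0 / v0)
```

Same state, same two letters P and V, but the two slopes differ by a factor of gamma because the derivative is taken along two different curves through that state.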
> things like "partial derivative of pressure with respect to volume" and add "at constant temperature"
They're the same thing, aren't they? Except that when you add the "at constant temperature" addendum, you're just making explicit the other variable(s) that can potentially be varied. Without it, it just means all other variables, whatever they may be, are held constant.
But if something depended on both temperature and some other quantity X, and you said "partial derivative of pressure with respect to volume at constant temperature," that would be sort-of misleading because you're only explicitly mentioning one of the other two variables - rather, you should say "at constant temperature and X" or not mention either of them.
They aren't the same thing since the first is strictly speaking not well defined - see my answer to the OP. I think the problem is that physicists use the same letter, say U, to denote multiple different mathematical functions depending on the context. The "holding XXX constant" thing serves to tell you which function you're dealing with formally.
For problems in the plane, it's natural to pick two coordinate functions and treat other quantities as functions of these. For example, you might pick x and y, or r and θ, or the distances from two different points, or...
In thermodynamics, there often isn't really one "best" choice of two coordinate functions among the many possibilities (pressure, temperature, volume, energy, entropy... these are the most common but you could use arbitrarily many others in principle), and it's natural to switch between these coordinates even within a single problem.
Coming back to the more familiar x, y, r, and θ, you can visualize these 4 coordinate functions by plotting iso-contours for each of them in the plane. Holding one of these coordinate functions constant picks out a curve (its iso-contour) through a given point. Derivatives involving the other coordinates holding that coordinate constant are ratios of changes in the other coordinates along this iso-contour.
For example, you can think of evaluating dr/dx along a curve of constant y or along a curve of constant θ, and these are different.
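A small numerical sketch of that example, using an arbitrary sample point (x, y) = (3, 4) and made-up function names. Along constant y the expected value is x/r, and along constant θ it is r/x.

```python
import math

def r_from_xy(x, y):
    """r expressed in the (x, y) coordinates."""
    return math.hypot(x, y)

def r_from_x_theta(x, theta):
    """r expressed in the (x, theta) coordinates: x = r cos(theta)."""
    return x / math.cos(theta)

x, y = 3.0, 4.0          # assumed sample point
theta = math.atan2(y, x)  # the same point's angle
h = 1e-6

# Move along the iso-contour of y (a horizontal line).
dr_dx_const_y = (r_from_xy(x + h, y) - r_from_xy(x - h, y)) / (2 * h)

# Move along the iso-contour of theta (a ray from the origin).
dr_dx_const_theta = (
    r_from_x_theta(x + h, theta) - r_from_x_theta(x - h, theta)
) / (2 * h)

print("dr/dx at constant y:    ", dr_dx_const_y)
print("dr/dx at constant theta:", dr_dx_const_theta)
```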
I first really understood this way of thinking from an unpublished book chapter of Jaynes [1]. Gibbs's "Graphical Methods in the Thermodynamics of Fluids" [2] is also a very interesting discussion of different ways of representing thermodynamic processes by diagrams in the plane. His companion paper, "A Method of Geometrical Representation of the Thermodynamic Properties of Substances by Means of Surfaces", describes an alternative representation as a surface embedded in a larger space, and these two different pictures are complementary and both very useful.
Here's a geometric way of looking at it. I'll start with a summary, and then give a formal-ish description if that's more your jam.
---
The fundamental issue is that physicists use the same symbol for the physical, measurable quantity, and the function relating it to other quantities. To be clear, that isn't a criticism: it's a notational necessity (there are too many quantities to assign distinct symbols for each function). But that makes the semantics muddled.
However, there is also a lack of clarity about the semantics of "quantities". I think it is best to think of quantities as functions over an underlying state space. Functional relationships _between_ the quantities can then be reconstructed from those quantities, subject to uniqueness conditions.
This gives a more natural interpretation for the derivatives. It highlights that an expression like S(U, N, V) doesn't imply S _is_ the function, just that it's associated to it, and that S as a quantity could be associated with other functions.
---
The state space S has the structure of a differential manifold, diffeomorphic to R^n [0].
A quantity -- what in thermodynamics we might call a "state variable" -- is a smooth real-valued function on S.
A diffeomorphism between S and R^n is a co-ordinate system. Its components form the co-ordinates. Intuitively, any collection of quantities X = (X_1, ..., X_n) which uniquely labels all points in S is a co-ordinate system, which is the same thing as saying that it's invertible. [1]
Given such a co-ordinate system, any quantity Y can naturally be associated with a function f_Y : R^n -> R, defined by f_Y(x_1, ..., x_n) := Y(X^-1(x_1, ..., x_n)). In other words, this is the co-ordinate representation of Y. In physics, we would usually write that, as an abuse of notation: Y = Y(X_1, ..., X_n).
This leads to the definition of the partial derivative holding some quantities constant: you map the "held constant" quantities and the quantity in the denominator to the appropriate co-ordinate system, then take the derivative of f_Y, giving you a function which can then be mapped back to a quantity.
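As a rough sketch of that recipe (not a definitive implementation): below, a state is modelled as a plain (P, V) tuple for a monatomic ideal gas with n = 1 mol, which is purely an assumption made so there is something concrete to compute with. The helper `partial_holding_constant` and the chart names are hypothetical.

```python
R = 8.314    # gas constant in J/(mol K)
N_MOL = 1.0  # assumed amount of gas

def U(state):
    """The quantity U as a function on state space (assumed monatomic ideal gas)."""
    p, v = state
    return 1.5 * p * v

def inverse_chart_p_v(coords):
    """X^-1 for the co-ordinate system X = (P, V)."""
    p, v = coords
    return (p, v)

def inverse_chart_p_t(coords):
    """X^-1 for the co-ordinate system X = (P, T), using P V = n R T."""
    p, t = coords
    return (p, N_MOL * R * t / p)

def partial_holding_constant(Y, inverse_chart, coords, index, h):
    """Derivative of f_Y = Y o X^-1 in slot `index`, other slots held constant."""
    def f_Y(c):
        return Y(inverse_chart(c))
    up, down = list(coords), list(coords)
    up[index] += h
    down[index] -= h
    return (f_Y(tuple(up)) - f_Y(tuple(down))) / (2 * h)

p, v = 101325.0, 0.024   # assumed sample state
t = p * v / (N_MOL * R)  # its temperature

dU_dP_at_V = partial_holding_constant(U, inverse_chart_p_v, (p, v), 0, 1.0)
dU_dP_at_T = partial_holding_constant(U, inverse_chart_p_t, (p, t), 0, 1.0)
print("(dU/dP)_V =", dU_dP_at_V)
print("(dU/dP)_T =", dU_dP_at_T)
```

There is only one `U` here; the "held constant" quantity just selects which inverse chart gets composed with it before differentiating.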
In that process, you have to make sure that the held constant quantities and the denominator quantity form a co-ordinate system. A lot of thermodynamic functions are posited to obey monotonicity/convexity properties, and this is why. It might also be possible to find a more permissive definition that uses multi-valued functions, similar to how Riemann surfaces are used in complex analysis.
To do that we'd probably want to be a bit more general and allow for "partial co-ordinate systems", which might also be useful for cases involving composite systems. Any collection of quantities (Y, X_1, ..., X_n) can be naturally associated with a relation [2], where (y, x_1, ..., x_n) is in the relation if there exists a point s in S such that (Y(s), X_1(s), ..., X_n(s)) = (y, x_1, ..., x_n). You can promote that to a function if it satisfies a uniqueness condition.
I think it is also possible to give a metric (Riemannian) structure on the manifold in a way compatible with the Second Law. I remember skimming through some papers on the topic, but didn't look in enough detail.
---
[0] Or half of R^n, or a quadrant maybe.
[1] The "diffeomorphism" definition also adds the condition that the inverse be smooth.
[2] Incidentally, same sense of "relation" that leads to the "relational data model"!
Does someone have a good explanation?