« That age – which includes such historically important figures as Arrow’s fellow Nobel laureates Paul Samuelson and Gary Becker – represented a development and expansion of formal economic theory that brought unprecedented precision to the logical foundations of social science. »
And here we are, picking the fruits of this golden age of unprecedented precision. With a mountain of record low-yielding debt serving as the backbone for a mountain range of derivatives and synthetic products. With low savings and with record public participation in investments. Because the infomercials say the risk^D^D^D^Dvolatility is controlled by science and computers n shit.
Slightly off-topic, but IMHO the theory of random walks, risk-neutral pricing, and the Black-Scholes option pricing formula are the most significant findings of the past 100 years in the field of economics, in terms of: practical applications (the options and futures markets are huge, and they all involve these formulas); spawning research (if you go on arXiv, the original work done in the late '60s and '70s is still spawning tons of research to this day on asset pricing and asset price dynamics, whereas other economic fields seem to have stagnated); and being mathematically and empirically sound (meaning it's a complete theory that does an adequate job describing reality). And it was groundbreaking in that the result was unexpected and answered a nagging question about how to price contracts without having to define a drift variable.
There are some issues with it, though. It uses volatility as a proxy for risk (fortunately, past performance is a perfect predictor of future performance, and future events can be predicted at least to the extent of normally distributed errors) and assumes a standard normal distribution (the fact that the actual distribution has a much heavier tail [1] can have no material effect, right?).
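For concreteness, here's the formula under discussion -- a minimal Python sketch of the Black-Scholes call price, with the criticized lognormal/Gaussian assumption baked in; the parameter names are mine:

    # Minimal Black-Scholes European call pricer -- the lognormal model
    # discussed above. Parameter names are illustrative.
    from math import log, sqrt, exp
    from statistics import NormalDist

    def bs_call(spot, strike, t, rate, vol):
        """Black-Scholes price of a European call.
        spot: current price; strike: strike; t: years to expiry;
        rate: risk-free rate; vol: annualized volatility."""
        d1 = (log(spot / strike) + (rate + 0.5 * vol**2) * t) / (vol * sqrt(t))
        d2 = d1 - vol * sqrt(t)
        N = NormalDist().cdf
        return spot * N(d1) - strike * exp(-rate * t) * N(d2)

    # Example: at-the-money call, 1 year out, 2% rate, 20% vol.
    print(bs_call(100.0, 100.0, 1.0, 0.02, 0.20))  # ~8.92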
> the theory of random walks, risk-neutral pricing, and the Black-Scholes option pricing formula are the most significant findings
Random walks, heck yes! Black-Scholes, not so much. It's a fantastic toy model, but overall, I think it did more harm than good. People took its thin-tail behavior too seriously. There's far more fluctuation in practice than a Gaussian process would fit or predict.
So I would put it as: mathematically sound enough, empirically not quite so.
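A quick numerical illustration of the thin-tail complaint (my own toy comparison, with a Student-t(3) standing in for real return tails):

    # Count "4-sigma" days under a Gaussian vs. a fat-tailed Student-t(3).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    gauss = rng.standard_normal(n)
    fat = rng.standard_t(df=3, size=n)
    fat /= fat.std()  # rescale to unit variance for a fair comparison

    for name, x in [("gaussian", gauss), ("student-t(3)", fat)]:
        print(name, np.mean(np.abs(x) > 4))
    # The Gaussian puts roughly 6e-5 of its mass beyond 4 sigma; the
    # rescaled t(3) puts vastly more there -- the "too much fluctuation"
    # described above.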
> It's a fantastic toy model, but overall, I think it did more harm than good. People took its thin-tail behavior too seriously.
Options theory is popularly misunderstood. Makes being a former options trader fun and annoying.
First, the impact. You can model almost anything as an option. Equity? A call option on a firm's assets struck at the liabilities. (Literally anything else? A portfolio of Arrow–Debreu options [0].) Modern risk models, and these include some very successful ones, would not work without Black-Scholes.
Second, no professional uses a Gaussian assumption without knowing what they're doing. (In any case, people have been re-writing Black-Scholes for other distributions since at least the 1990s.) No model can economically ascertain every possible risk. One must always choose risks one considers negligible. This is true in finance as in life. Sometimes we choose to ignore the wrong risks. When that happens, it's easier to blame highfalutin math than to admit "I didn't think about that".
No amount of "fat tailing" will substitute for rigorous risk management. Case in point: the multitude of funds launched after 2008 focusing on "black swans" and "fat tails". Pretty much all of them lost money [3]. They missed, along with everyone else, Dubai's near default, Greece's actual default, the 2010 Flash Crash, 2011's summer volatility, China's 2015 crash, the following August's international crash, Brexit, and practically anything else that one might consider both significant and unexpected. (No shit.)
The real "magic" in Black Scholes? It's not the distribution. It's volatility. Modern models expand this once single term into a multidimensional beast [4]. This is the other reason for the prevalence of Gaussian assumptions. Traders trade computation of the distribution for computation of the volatility surface. The latter (tau) almost always dominates the former (delta) in terms of what's being mispriced.
Black-Scholes-Merton is a theory, analogous to Newtonian mechanics or textbook thermodynamics. They all need modification to work in the real world. That doesn't make them BS.
> [Using a fat-tailed distribution] can, however, give one a better appreciation of the risk
How does one choose a model and parameters for events which are, by definition, hard to predict? Keep in mind that most "black swans" arise from unforeseen dimensions of risk. There's the "my stock lost 90% of its value" fat tail versus "the damn exchange went bust". There's "my clearing bank is broke" and "the Russians invaded my country".
I agree that random walks are the most sensible way to model the financial markets. You should be suspicious of anyone telling you otherwise (can they actually beat the market?), especially now, with the mountain of evidence that pretty much no one can consistently beat a good market index.
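For concreteness, the standard random-walk model of prices is geometric Brownian motion -- a minimal simulation sketch (all numbers made up):

    # Random-walk price model: geometric Brownian motion,
    # S_{t+dt} = S_t * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z).
    import numpy as np

    rng = np.random.default_rng(42)
    mu, sigma = 0.05, 0.20      # made-up annual drift and volatility
    dt, steps = 1 / 252, 252    # one year of daily steps
    z = rng.standard_normal(steps)
    log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    path = 100.0 * np.exp(np.cumsum(log_returns))  # start at 100
    print(path[-1])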
The only issue with that is that a few groups have consistently beaten the market over time, so perhaps it's more pseudorandom than we think. One example might be the Renaissance Medallion Fund.
I think people who actually can beat the market fall into the general category of "knows something others don't". Renaissance is probably one of the very few firms that could qualify as that on purely technocratic means (instead of the usual, which is shades of grey in what's really insider trading). The other category would be simply HFT, but I'm not sure how profitable that is at this point.
Most of the people who seem to beat the market, though, are generally just happy beneficiaries of survivorship bias (e.g., the 10,000 monkeys on typewriters effect).
From the OP, I see that I've been closer to Arrow's work than I knew!
In grad school, I had a relatively severe introduction to optimization, including both linear and non-linear programming. The non-linear programming was mostly about the Kuhn-Tucker conditions, and there the work was mostly about the Kuhn-Tucker necessary conditions. Kuhn and Tucker were long at Princeton. The guy who was the Chair of my Ph.D. oral exam had been a Tucker student.
Before I got to that grad program, I had carefully studied W. Rudin, 'Principles of Mathematical Analysis' (a.k.a. Baby Rudin) and W. Fleming, 'Functions of Several Variables'. In my first year of grad school, I also had a severe course in H. Royden, 'Real Analysis', the real part of W. Rudin, 'Real and Complex Analysis', Neveu, Breiman, Chung, etc.

So, that background gave tools that helped attack the Kuhn-Tucker conditions.
Intuitive view of the Kuhn-Tucker conditions (KTC): You are in a cave with an uneven floor and vertical walls, and you want to find the lowest point. If you put down a marble and it starts to roll, then you are not at the lowest point. So, to be at the lowest point, it is necessary that the marble not roll.
But K-T wanted more: They wanted to say that necessarily the slope of the floor (the calculus gradient) and the slopes of the constraints that define the walls are such that the slopes from the walls block moving along the slope of the floor. Right, the slopes from the walls form a cone that contains the slope from the floor (or its negative, depending on maximizing or minimizing and the direction of the constraints, etc.) -- it's all about a cone.
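A toy numerical illustration of that cone picture (my own example, using scipy): minimize x^2 + y^2 subject to x + y >= 1. At the optimum, the floor's gradient is a nonnegative multiple of the wall's gradient -- the blocked marble.

    # Toy Kuhn-Tucker check: minimize x^2 + y^2 subject to x + y >= 1.
    # At the optimum, the objective's gradient lies in the cone spanned
    # by the gradients of the active constraints.
    import numpy as np
    from scipy.optimize import minimize

    f = lambda v: v[0]**2 + v[1]**2
    cons = [{"type": "ineq", "fun": lambda v: v[0] + v[1] - 1.0}]
    res = minimize(f, x0=[2.0, 2.0], constraints=cons)
    x, y = res.x
    print(res.x)                       # ~(0.5, 0.5)

    grad_f = np.array([2 * x, 2 * y])  # slope of the floor
    grad_g = np.array([1.0, 1.0])      # slope of the wall
    lam = grad_f @ grad_g / (grad_g @ grad_g)
    print(lam, grad_f - lam * grad_g)  # multiplier ~1, residual ~0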
Well, this stronger statement is true in nice cases, and for a nice case you have to have some assumptions that the constraints are nice. For that there are various KT constraint qualifications (KT CQs) that are enough to make the KT statements about slopes true.
There are lots of KT CQs, and one question was: which imply the others? For two famous KT CQs, one due to KT and one due to Zangwill, it was not known whether they were independent.
So, as a grad student, I settled that -- they are independent. The proof was by counterexample -- I found some bizarre constraints.
To know that such bizarre (goofy, pathological, etc.) constraints could exist, I needed essentially a theorem: For a positive integer n, the real numbers R, Euclidean n-space R^n with the usual topology, and a subset C of R^n closed in that topology, there exists a function

    f: R^n --> R

such that f is zero on the closed set C, strictly positive otherwise, and infinitely differentiable. So, I proved that. For the KT CQ I didn't need all of infinitely differentiable, but I got that also.

This result is curious in part because some examples of a closed set C can be surprisingly intricate, e.g., the Mandelbrot set, a sample path of Brownian motion, Cantor sets of positive measure, etc.
As I went to publish, I discovered that my work also answered a question asked but not answered in a paper by Arrow, Hurwicz, and Uzawa.
Of course Arrow got his Nobel Prize. A few years ago, so did Hurwicz. Last I heard, Uzawa was still waiting! Cute: As a grad student, I answered a question asked but not answered by Arrow, Hurwicz, and Uzawa. Reading Rudin and Fleming helped!
Gee, in the OP, I see that Arrow was also interested in decision making under uncertainty. Well, my dissertation research was in best decision making over time under uncertainty -- stochastic optimal control.
I never took a course in economics. My Ph.D. advisor thought that I would need such a course, if only later in my career to fend off nonsense objections from economists -- I've never needed that!

So, I signed up for an econ course, went the first day, sat in the front row, said nothing, and took careful notes. After the class, when just the professor and I were there, I asked him what he was assuming for his supply and demand curves -- continuous, uniformly continuous, differentiable, continuously differentiable, infinitely differentiable, convex, pseudo-convex, quasi-convex, etc.? He said nothing.

Soon I got a call from my department secretary to call my Ph.D. advisor -- I was out of the econ course!
Still, the OP shows that I was closer to some mathematical economics than I knew!

Maybe someday some people in data science or artificial intelligence will exploit the KTC!
> I asked him what he was assuming for his supply and demand curves -- continuous, uniformly continuous, differentiable, continuously differentiable, infinitely differentiable, convex, pseudo-convex, quasi-convex, etc.? He said nothing.
> Soon I got a call from my department secretary to call my Ph.D. advisor -- I was out of the econ course!
What the hell? The first two semesters of microeconomic theory in graduate school go over all of that. Supply and demand are formalized from set theory on up (from Arrow's work!).
If you are interested, the Mas-Colell/Whinston/Green text is the bible all Ph.D. students are forced through in microeconomics. It starts at the set-theory level, defines what permits construction of a utility function, and gets to defining supply and demand from there. Then it gets to game theory and other topics.
You even have the theorems where free markets fail to be an efficient mechanism of allocation! We've known those things for decades, but economics is so politicized that it's hard for that information to get out.
I was a grad student in applied math and had done the basic research for my dissertation, solved the KT CQ problem, and was polishing the research and writing the illustrative software when my advisor suggested I take an econ course.
The econ course was in the econ department, not my department, and was not an econ grad course!
But, whatever the course was, the econ prof was apparently just terrified of my question!
One way and another, maybe I've touched on much of what you mentioned. E.g., the optimization I studied, with the math rock solid, was a good start on game theory. Later, to support us while my wife finished her Ph.D., and as basically a time-out from my own Ph.D., I took a job in military systems analysis with a lot of game theory in it. So, I dug into parts of G. Owen's book on game theory, which did the axiomatic utility function stuff, and T. Parthasarathy and T. E. S. Raghavan, which did a lot of fixed point theorems, Sion's result, Lemke's proof of Nash's result, etc. Sure, in the relatively general game theory the job had me in, I had to consider saddle-point results, and apparently that is the core of equilibrium theory in econ.
Once I tried a book on math econ, and it was just a lot of elementary regression analysis. Later I saw another such book, by Tata?, more advanced but still regression -- a place to see more about regression than you'd want to know, and maybe the AI people should take a look. Later I saw another such book, by Duffie, and early on it was heavily about the Kuhn-Tucker conditions. I read the first chapter or two quickly and had some questions, went back, read carefully, and found a counterexample for every statement in that material.
I did want to see a clean, solid, mathematical treatment of the Sharpe idea but didn't find that -- D. Luenberger, a good mathematician, has a book on finance that may have such a treatment.
Thanks for the reference on micro. I copied it to my file for such things and will look at it if I get interested in econ after I exit from my startup!
Most of the advanced math in current economics research is either in econometrics or game theory. I'm not saying econ is math-less, far from it, but most of the time we can't simply solve our problems by applying advanced math the way physics can.
Equilibrium concepts in game theory are a tough thing. The holy grail is still a unifying concept of a "stable equilibrium" (see Kohlberg & Mertens '86), which is pretty much a guaranteed Nobel (but I'm not sure one even exists; so many geniuses have worked on the problem without success).
> I asked him what he was assuming for his supply and demand curves -- continuous, uniformly continuous, differentiable, continuously differentiable, infinitely differentiable, convex, pseudo-convex, quasi-convex, etc.? He said nothing.
> Soon I got a call from my department secretary to call my Ph.D. advisor -- I was out of the econ course!
This is my favorite economist joke: A physicist, a chemist, and an economist are stuck on a desert island together. They find a can of beans and are discussing how to open it. The physicist says, "Let's climb up the tree and drop a rock on it. The force from the drop will make the can burst open." Then the chemist says, "We should put the can in some saltwater. The metal will corrode, and we can get inside." Finally the economist says, "Let's assume a can opener..."
Yup, on wild assumptions: considering utility functions, the average busy housewife and mother of four children, all under 6, goes to a big grocery store, gets a list of all the items for sale and their prices, notes her utility function as a function of the whole inventory, notes her grocery budget, and solves the likely NP-complete, non-linear, discrete (integer) optimization problem -- maybe under uncertainty, if she is buying extra bananas on sale and risking that they go bad too soon, or buying too little fresh chicken hoping it may be on sale in two days, etc. -- all in her head, right away!
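To put a number on how unreasonable that is: even the stripped-down version of the problem (one budget constraint, known utilities, no uncertainty) is already 0/1 knapsack, NP-hard in general. A toy sketch, with made-up items and utilities:

    # The stripped-down shopping problem is 0/1 knapsack: pick items
    # maximizing utility within a budget. Items here are made up.
    from itertools import combinations

    items = {"bananas": (3, 4), "chicken": (8, 9), "relish": (2, 1),
             "bread": (3, 5), "coffee": (7, 8)}  # name: (price, utility)
    budget = 12

    best = max(
        (combo for r in range(len(items) + 1)
               for combo in combinations(items, r)
         if sum(items[i][0] for i in combo) <= budget),
        key=lambda combo: sum(items[i][1] for i in combo))
    print(best)  # brute force: fine for 5 items, hopeless for a whole store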
Yup, and if you buy a lot more transistors, then the price for each will go up? Hmm. Transistors used to be several dollars each, and now you can get a billion or so for less than $100.
If you buy more disk space, then the price per byte will go up? Gee, it used to be that you got 300 MB for $40,000, and now you can get 2 TB for about $50.
If you buy more computing, then the price per unit of computing will go up? Hmm, it used to be that you could get a nice DEC dumb terminal for $1400, and now you can get one heck of a desktop computer for that.
So, sometimes, if you buy a lot more of something and wait a while, the price per unit can go down, not up. If you don't want to wait, then buy in quantity and get a volume discount. Or, instead of buying the teeny, tiny, itty-bitty bottles of sweet pickle relish, buy the gallon size at much less cost per ounce. If you are buying in really big quantities, say, Hertz buying a fleet of cars, then Ford can put on an extra shift and get the price way down just for Hertz.
Net, that one day in econ class looking at freehand, apparently differentiable, convex supply curves was a bummer.
> Maybe someday some people in data science or artificial intelligence will exploit the KTC
Climb out from under that rock, will you :) [We have met enough over HN that I thought I could take the liberty in good fun.]
As I have said (a little less than a million times to you), there is more to data science/machine learning than Breiman's CART, which for some reason you have latched on to. If you do a random walk over machine learning concepts, the parts that don't involve the KTC are a set of measure zero.
Read Vapnik; I bet you will like it. Glivenko-Cantelli on steroids meets the KTC: that is what Vapnik's results are. I doubt anyone will dispute that he is the father of statistical learning theory.
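For what it's worth, one place the KTC visibly meet Vapnik's work is the SVM dual: complementary slackness zeroes out the multipliers of all but the margin-active points -- the support vectors. A toy demonstration (scikit-learn, synthetic data):

    # In the SVM dual, the Kuhn-Tucker complementary slackness
    # conditions force most Lagrange multipliers to zero; only the
    # margin-active points (support vectors) survive.
    from sklearn.datasets import make_blobs
    from sklearn.svm import SVC

    X, y = make_blobs(n_samples=200, centers=2, random_state=0)
    clf = SVC(kernel="linear", C=1.0).fit(X, y)
    print(len(clf.support_vectors_), "support vectors out of", len(X))
    # clf.dual_coef_ holds the nonzero multipliers (times the labels).
    print(clf.dual_coef_.shape)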
That rock you mention is where I work on my startup -- with money in mind. My main reason to do applied math is to apply it, to business, the money-making kind.
Yes, you have told me about Vapnik before, but the rest I see, e.g., deep learning, smells like tweaks on Breiman's CART. I have a lot of respect for Breiman, maybe more for his fellow student Neveu, as they were at Berkeley under Loève.
I have a tough time taking Silicon Valley seriously on anything serious about machine learning -- new words for statistical model building and estimation, but done with much more data.
In

Trevor Hastie, Robert Tibshirani, and Jerome Friedman, 'The Elements of Statistical Learning: Data Mining, Inference, and Prediction', Second Edition, Springer, 2008,

I find mention of Vapnik, mostly for separating hyperplanes, and the references

Vapnik, V. (1996). 'The Nature of Statistical Learning Theory', Springer, New York.

Vapnik, V. (1998). 'Statistical Learning Theory', Wiley, New York.
Looking at Hastie, et al., which seems relatively elementary, I wouldn't expect to find much on the Kuhn-Tucker conditions or Glivenko-Cantelli.
In

Kevin P. Murphy, 'Machine Learning: A Probabilistic Perspective', ISBN 978-0-262-01802-9, MIT Press, 2012,

I saw no mention of Vapnik.
In

Shai Shalev-Shwartz and Shai Ben-David, 'Understanding Machine Learning: From Theory to Algorithms', 2014,

I see

Vapnik, V. (1992), Principles of risk minimization for learning theory, in J. E. Moody, S. J. Hanson & R. P. Lippmann, eds, 'Advances in Neural Information Processing Systems 4', Morgan Kaufmann, pp. 831-838.

Vapnik, V. (1995), 'The Nature of Statistical Learning Theory', Springer.

Vapnik, V. N. (1982), 'Estimation of Dependences Based on Empirical Data', Springer-Verlag.

Vapnik, V. N. (1998), 'Statistical Learning Theory', Wiley.

Vapnik, V. N. & Chervonenkis, A. Y. (1971), 'On the uniform convergence of relative frequencies of events to their probabilities', Theory of Probability and Its Applications XVI(2), 264-280.

Vapnik, V. N. & Chervonenkis, A. Y. (1974), 'Theory of Pattern Recognition', Nauka, Moscow. (In Russian.)
Okay, that may be a first-cut list of what to read on Vapnik.

Gee, machine learning and not from Silicon Valley -- the second part is a step up! Not from Pittsburgh either -- two steps up! Probabilistic and from Russia? More steps up! For more steps up, maybe some French or Japanese authors could contribute?

For such things, I could take Bertsekas at MIT seriously.
I'm deep into my startup; there I've done my applied math derivations and written the code.

It is true that I have some suspicions that some old techniques tweaked a little, and maybe some newer techniques, might increase accuracy some, but I'm leaving that on the back burner for now, avoiding premature optimization.

But I'll index this post and not forget about Vapnik.
Indeed, I can imagine (and still grossly underestimate) how frantically busy you must be. All the best for your startup. As I said, it was just friendly ribbing. Given our interactions on HN, I have come to gauge your taste a bit, hence my recommendations.
Vapnik, V. N. (1998), 'Statistical Learning Theory' is a tome. You probably would not have time to read it. I would say go with the other two books of his; they sort of summarize his body of early work. Again, a lot of water has flowed under the Vapnik bridge since, but you will get a non-hyped view of what ML is about.
Kevin P. Murphy, 'Machine Learning: A Probabilistic Perspective' is also a hefty read, but it's a good source and quite complementary to Vapnik's treatise. Murphy's book is about how to tractably model the joint distribution of a (mostly discrete) set of random variables and draw inferences from it. So it is about smart and efficient ways to marginalize, condition, and compress.
Shai Shalev-Shwartz and Shai Ben-David, 'Understanding Machine Learning: From Theory to Algorithms', 2014:
This is also a great book. You will see a lot of KTC here but mostly within the framework of convexity and duality.
Yes, I've heard nearly only about Kuhn-Tucker; apparently Karush got ripped off.
E.g., it's the Cooley-Tukey fast Fourier transform, but IIRC once that work became popular -- and for a while it was wildly popular, and now it is likely just crucial at the core of lots of work -- some people found similar or identical work going way back. Still, Cooley and Tukey get the credit.
> Intuitive view of the Kuhn-Tucker conditions (KTC): You are in a cave with an uneven floor and vertical walls, and you want to find the lowest point. If you put down a marble and it starts to roll, then you are not at the lowest point. So, to be at the lowest point, it is necessary that the marble not roll.
Lovely analogy; I'll use it when explaining this to people from now on!
No, it's not standard. I doubt that it is in Counterexamples in Analysis, one of my favorite books.
No, you don't need a non-empty interior for the closed set. The closed set can even be empty. The function is to be zero on the closed set and positive otherwise. For a single point, you could use just the squared distance to the point, i.e., a parabola.
For a closed set like the Mandelbrot set, and for the general case, you have to try harder.
For a fast version of the proof: outside the closed set is an open set. If it is empty, then just let the function f = 0. Else, in that open set, pick a countable dense set. Roughly, for each point in that countable dense set, solve the problem for that point, and then add all the countably many partial solutions in a convergent way.
A key is a Baby Rudin exercise: a function g: R --> R where

    g(x) = 0 for x <= 0
    g(x) > 0 for x > 0

and g is infinitely differentiable. Sure, g is based on an exponential.
Then for f on R^2, at each of the countably many points, spin g to make a smooth hill whose bottom just touches the closed set C. The spin, being intuitive here, also works for f on R^n. Sure, for each of the countably infinitely many points, in the spin g(z), z is the distance to the point. I'm being intuitive and sloppy here to be brief and easier to understand.
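Written out a little less sloppily (my notation; just a sketch):

    g(x) = \begin{cases} e^{-1/x}, & x > 0 \\ 0, & x \le 0 \end{cases}

    % U = \mathbb{R}^n \setminus C is open; \{x_k\} is countable, dense in U;
    % r_k = \tfrac{1}{2}\min(1, \operatorname{dist}(x_k, C)), so B(x_k, r_k) \subseteq U.
    f(x) = \sum_{k=1}^{\infty} \varepsilon_k \, g\!\left( r_k^2 - \lVert x - x_k \rVert^2 \right)

Each summand is smooth and vanishes off B(x_k, r_k), hence vanishes on C; since the x_k are dense and r_k is comparable to the distance from x_k to C, every point of U gets a strictly positive summand. Choosing the epsilon_k small enough that the series and all its term-by-term partial derivatives converge uniformly makes f infinitely differentiable.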
Then in R^2, at the origin, for each positive rational of the form p/q in lowest terms, have a ray from the origin at angle p radians and length 1/q. That's a bizarre closed set. Then at the origin, the Zangwill and KT constraint qualifications don't agree -- they are independent. Here, again, I'm being brief.
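In symbols, if I'm reading the description right, the set is something like

    C = \{(0,0)\} \cup \bigcup_{\substack{p/q \in \mathbb{Q}_{>0} \\ \gcd(p,q)=1}} \left\{ t\,(\cos p, \sin p) : 0 \le t \le \tfrac{1}{q} \right\}

i.e., rays in the (dense) directions of p radians, with lengths shrinking as the denominators q grow.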