{"id":1062,"date":"2016-04-30T16:36:58","date_gmt":"2016-04-30T21:36:58","guid":{"rendered":"http:\/\/blogs.ams.org\/beyondreviews\/?p=1062"},"modified":"2018-03-14T08:29:08","modified_gmt":"2018-03-14T13:29:08","slug":"happy-birthday-claude-shannon","status":"publish","type":"post","link":"https:\/\/blogs.ams.org\/beyondreviews\/2016\/04\/30\/happy-birthday-claude-shannon\/","title":{"rendered":"Happy Birthday, Claude Shannon"},"content":{"rendered":"<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Claude_Shannon\">Claude Shannon<\/a> is famous among mathematicians and computer scientists for his remarkable work, particularly in the realm of \u00a0information theory. \u00a0Of particular importance is Shannon&#8217;s notion of information entropy, often referred to now as &#8220;Shannon entropy&#8221;. \u00a0He launched the theory in 1948 in a remarkable paper titled, &#8220;<a href=\"http:\/\/ieeexplore.ieee.org\/stamp\/stamp.jsp?tp=&amp;arnumber=6773024\">A Mathematical Theory of Communication<\/a>&#8220;. \u00a0 Below, I will reproduce <a href=\"http:\/\/www.ams.org\/mathscinet\/MRAuthorID\/59155\">Joseph Doob<\/a>&#8216;s review of Shannon&#8217;s famous paper. \u00a0<!--more--><\/p>\n<div style=\"width: 177px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-1978\" src=\"http:\/\/blogs.ams.org\/beyondreviews\/files\/2018\/03\/ClaudeShannon_MFO3807.jpg\" width=\"284\" height=\"400\" alt=\"Claude Shannon\" srcset=\"https:\/\/blogs.ams.org\/beyondreviews\/files\/2018\/03\/ClaudeShannon_MFO3807.jpg 284w, https:\/\/blogs.ams.org\/beyondreviews\/files\/2018\/03\/ClaudeShannon_MFO3807-213x300.jpg 213w\" sizes=\"auto, (max-width: 284px) 100vw, 284px\" \/><p class=\"wp-caption-text\">Photo credit: <a href=\"http:\/\/owpdb.mfo.de\/detail?photo_id=3807\">Oberwolfach Photo Collection<\/a> (<a href=\"http:\/\/creativecommons.org\/licenses\/by-sa\/2.0\/de\/deed.en\">CC BY-SA 2.0 DE<\/a>)<\/p><\/div>\n<p>During World War II, he worked briefly with Alan Turing, when Turing came to the United States to advise\u00a0the American crypto efforts. \u00a0Shannon had a playful streak, too. \u00a0He was an avid juggler, and enjoyed unicycles. \u00a0Shannon was fond of toys. \u00a0He constructed a version of <a href=\"http:\/\/web.media.mit.edu\/~minsky\/\">Marvin Minsky<\/a>&#8216;s\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Useless_machine\">useless machine,<\/a> a device whose only function is to turn itself off. \u00a0There is a great <a href=\"https:\/\/www.youtube.com\/watch?v=cZ34RDn34Ws\">video<\/a>\u00a0of one such machine available. \u00a0\u00a0<a href=\"http:\/\/www.ams.org\/mathscinet\/MRAuthorID\/743165\">Siobhan Roberts<\/a> has an excellent <a href=\"http:\/\/www.newyorker.com\/tech\/elements\/claude-shannon-the-father-of-the-information-age-turns-1100100\">profile of Shannon<\/a> in the New Yorker. \u00a0Other tributes abound, of course.<\/p>\n<hr \/>\n<p class=\"headline\"><strong><a href=\"http:\/\/www.ams.org\/mathscinet-getitem?mr=26286\">MR0026286<\/a><\/strong><br \/>\n<a href=\"http:\/\/www.ams.org\/mathscinet\/search\/author.html?mrauthid=159675\">Shannon, C. E.<\/a><br \/>\n<span class=\"title\">A mathematical theory of communication.<\/span><br \/>\n<a href=\"http:\/\/www.ams.org\/mathscinet\/search\/journaldoc.html?cn=Bell_System_Tech_J\"><em>Bell System Tech. J.<\/em><\/a> <strong>27, <\/strong>(1948). 
379–423, 623–656.
<a href="http://www.ams.org/mathscinet/search/mscdoc.html?code=60.0X">60.0X</a>

The author considers an information source which can produce any one of a finite number of symbols. These are fed into a channel for transmission, each symbol having a positive time duration in the channel. The problem becomes statistical with the assumption that if $x_n$ is the $n$th symbol produced by the source, the $x_n$ process is a stationary stochastic process. (Further hypotheses of Markov type are also made in the applications.) The concept of the entropy $H(U)$ of a random variable $U$ which can assume only a finite number of values, with probabilities $p_1,p_2,\cdots$, is fundamental: $H(U)=-\sum_i p_i\log p_i$. The entropy is a measure of the uncertainty in the value of $U$, and has various suggestive properties indicating this; for example, $H(U)$ is greater than or equal to the entropy of $U$ conditioned by the knowledge of a second random variable. If the $x_n$'s above are mutually independent, the entropy of the information source is defined as $H(x_n)$, and is a measure of the amount of information available in the source for transmission. A slightly modified definition of source entropy is made if the $x_n$'s are not mutually independent. The capacity of a channel is defined as $C=\lim_{T\rightarrow\infty}\log N(T)/T$, where $N(T)$ is the maximum number of sequences of symbols that can be transmitted in time $T$, and it is shown that for given $H$ and $C$ the symbols can be encoded in such a way that $C/H-\epsilon$ symbols per second can be transmitted over the channel if $\epsilon>0$ but not if $\epsilon<0$. Many examples are given and the effect of noise on the transmission is treated in some detail. For example, if $H\leq C$ above, there exists a coding system such that the output of the source can be transmitted over the channel with an arbitrarily small error frequency. [For the early beginnings of this theory cf. Nyquist, same J. <strong>3</strong>, 324–346 (1924); Hartley, ibid. <strong>7</strong>, 535–563 (1928).]

In the second part of the paper the author treats absolutely continuous distributions. The entropy of a distribution in one or more dimensions, with density $p(x)$, is defined as $-\int p(x)\log p(x)\,dx$, and various qualitative properties of entropy are discussed. For example, the entropy of a distribution in one dimension, with finite dispersion $\sigma^2$, is a maximum (relative to a not very clearly specified set of unbounded distributions) for a normal distribution.
The entropy per degree of freedom of a stationary and ergodic process with random variables $x_1,x_2,\cdots$ is defined as $H'=\lim_{n\rightarrow\infty}H_n/n$, where $H_n$ is the entropy of the distribution of $x_1,\cdots,x_n$, and it is stated that if $p(x_1,\cdots,x_n)$ is the density of the latter distribution, then $-n^{-1}\log p$ converges in probability to $H'$. The entropy decrease caused by the action of a linear filter on a stationary ergodic process is evaluated. The rate $R$ of transmission of information over a channel is defined and the channel capacity is then defined as the maximum of $R$ for an arbitrarily varying input. For example, the capacity of a channel of band $0$ to $W$ cps perturbed by white noise of power $N$, for average transmission power $P$, is $W\log((P+N)/N)$, and this can be interpreted to mean that proper coding will allow the transmission of this many binary digits of information per second with arbitrarily small error frequency. Finally the rate of generating information relative to a certain degree of fidelity is defined and discussed. The discussion is suggestive throughout, rather than mathematical, and it is not always clear that the author's mathematical intentions are honorable. The point of view is that stressed by Wiener in his NDRC report [soon to be published as a book] "The Interpolation, Extrapolation, and Smoothing of Stationary Time Series," in which communication is considered as a statistical problem, specifically in its mathematical formulation as the study of stationary stochastic processes, and of the results of various operations performed on them.

Reviewed by J. L. Doob
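A couple of quick illustrations may help unpack Doob's summary. First, the entropy $H(U)=-\sum_i p_i\log p_i$ is easy to compute directly. Here is a minimal Python sketch (the helper name <code>entropy</code> is mine, not Shannon's or Doob's), using base-2 logarithms so the units are bits:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H(U) = -sum_i p_i log p_i, in units set by `base`.

    Terms with p_i = 0 contribute nothing, by the convention 0 log 0 = 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin carries exactly 1 bit of uncertainty.
print(entropy([0.5, 0.5]))    # 1.0

# A biased coin is more predictable, so its entropy is lower.
print(entropy([0.9, 0.1]))    # ~0.469 bits

# Uniform on 8 symbols attains the maximum, log2(8) = 3 bits.
print(entropy([1 / 8] * 8))   # 3.0
```

The biased coin having lower entropy than the fair one is the quantitative form of the intuition Doob describes: entropy measures uncertainty, and anything that makes the outcome more predictable drives it down.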
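Second, the convergence statement in the review — that $-n^{-1}\log p(x_1,\cdots,x_n)$ converges in probability to $H'$ — is what is now called the asymptotic equipartition property. A small simulation sketch for the simplest case, an i.i.d. biased coin (so $H'$ reduces to the per-symbol entropy):

```python
import math
import random

random.seed(0)

p = 0.9   # probability of a 1 on each flip of a biased coin (i.i.d. source)
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))   # per-symbol entropy

for n in (100, 1_000, 10_000):
    x = [1 if random.random() < p else 0 for _ in range(n)]
    ones = sum(x)
    # log2-probability of this particular length-n sequence under the model
    log2_prob = ones * math.log2(p) + (n - ones) * math.log2(1 - p)
    print(f"n={n:6d}: -log2(prob)/n = {-log2_prob / n:.4f}  (H = {H:.4f})")
```

For large $n$ the printed ratio settles near $H\approx 0.469$ bits: typical long sequences all have probability close to $2^{-nH}$, which is what makes efficient coding possible.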
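Finally, the band-limited capacity $W\log((P+N)/N)$ that Doob quotes is now known as the Shannon–Hartley formula, usually written $C=W\log_2(1+P/N)$ bits per second. Plugging in illustrative numbers (a hypothetical 3 kHz channel at a 30 dB signal-to-noise ratio — values chosen for the example, not taken from the paper):

```python
import math

W = 3000                       # bandwidth in Hz (hypothetical phone-grade channel)
snr_db = 30                    # signal-to-noise ratio P/N, in decibels
snr = 10 ** (snr_db / 10)      # = 1000

C = W * math.log2(1 + snr)     # capacity in bits per second
print(round(C))                # about 29902 bits per second
```

The result, roughly 30 kbit/s, is essentially the ceiling that telephone-line modems eventually pushed up against.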