{"id":29014,"date":"2016-06-08T10:21:54","date_gmt":"2016-06-08T15:21:54","guid":{"rendered":"http:\/\/blogs.ams.org\/mathgradblog\/?p=29014"},"modified":"2016-06-13T20:12:45","modified_gmt":"2016-06-14T01:12:45","slug":"okcupid-math-online-dating","status":"publish","type":"post","link":"https:\/\/blogs.ams.org\/mathgradblog\/2016\/06\/08\/okcupid-math-online-dating\/","title":{"rendered":"OKCupid: The Math Behind Online Dating"},"content":{"rendered":"<h3>Guest Author: Michalina Malysz<\/h3>\n<p>&#8220;Like you use sentences to tell a person a story; you use algorithms to tell a story to a computer&#8221; (Rudder, 2013).<\/p>\n<p>In today\u2019s day and age, we have the world at our fingertips. The internet has made many things easier, including dating, allowing us to interact and connect with a plethora of new people, even those who were deemed unreachable just fifteen minutes beforehand.<\/p>\n<p><em><a href=\"https:\/\/www.youtube.com\/watch?v=m9PiPlRuy6E\">Inside OKCupid: The math of online dating<\/a> <\/em>explains the formula used to match people with one another on OKCupid,\u00a0one of the leading online dating websites.<!--more--> Christian Rudder, one of the founders of OKCupid, examines how an algorithm can be used to link two people and to measure their compatibility based on a series of questions. As they answer more questions with similar answers, their compatibility increases.<\/p>\n<p>You may be asking yourself how we explain the components of human attraction in a way that a computer can understand. Well, the number one component is research data. 
OKCupid collects data by asking users to answer questions, which range from minor subjects like taste in movies or songs to major topics like religion or how many kids the other person wants.<\/p>\n<p>Many would think these questions simply match people by their shared likes, but people often answer questions with opposite responses. Because two people can disagree on a question, OKCupid also records the answer each user would want from an ideal partner and adds another dimension to the data: a level of importance. For example: what role does the question play in the user&#8217;s life? How relevant is it? To calculate compatibility, the computer must compare each user&#8217;s answer to a question, the answer they want from an ideal partner, and the importance they assign to the question against the same three pieces of data from someone else. This is done with a weighted scale that assigns a point value to each level of importance, as seen below:<\/p>\n<table>\n<tbody>\n<tr>\n<th>Level of Importance<\/th>\n<th>Point Value<\/th>\n<\/tr>\n<tr>\n<td>Irrelevant<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>A Little Important<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>Somewhat Important<\/td>\n<td>10<\/td>\n<\/tr>\n<tr>\n<td>Very Important<\/td>\n<td>50<\/td>\n<\/tr>\n<tr>\n<td>Mandatory<\/td>\n<td>250<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>You may be asking yourself, &#8220;How is this computed?&#8221; Let&#8217;s say you are person A and the person the computer is trying to match you with is person B. The first question is: how much did person B&#8217;s answers satisfy you? 
The answer is set up as a fraction. The denominator is the total number of points you allocated across the questions, based on the level of importance you assigned to each one. The numerator is the total number of points that person B&#8217;s answers earned: person B receives a question&#8217;s points only when their response matches what you were looking for, and the number of points is the value of the importance level you gave that question.<\/p>\n<p>This is done for each question; the points are then totaled and the resulting fraction is turned into a percentage. That percentage is your percent satisfaction: how happy you would be with person B based on how you answered the questions. Step two is done the same way, except that the question is how much your answers satisfied person B. After that computation we are left with person B&#8217;s percent satisfaction.<\/p>\n<p>The overall algorithm that OKCupid uses is to take the nth root of the product of person A\u2019s percent satisfaction and person B\u2019s percent satisfaction; with two scores, that is the square root. This is a mathematical way of expressing how happy you would be with each other based on how you answered the questions for the computer. Why use this more complex algorithm of multiplying and square-rooting\u00a0when you could just take the average of the two scores? Well, a geometric mean, which is &#8220;a type of mean or average which indicates the central tendency or typical value of a set of numbers&#8221; (Rudder, 2013), is ideal for this situation because it works well for sets of values with wide ranges and for comparing values that represent very different properties, such as your taste in literature, your plans for the future, and even whether or not you believe in God. Best of all, the algorithm can still be useful even when there is a very small set of data. 
OKCupid also uses a margin of error, which is &#8220;a statistic expressing the amount of random sampling error in a survey&#8217;s results&#8221; (Rudder, 2013), to give person A the most confidence in the match process. OKCupid always shows you the lowest match percentage possible because it wants person A and person B to answer more questions, which increases the confidence of the match. For example, if person A and person B have answered only two of the same questions, the margin of error for that sample size is 50%. This means that the highest possible match percentage is 50%. Below I have included a table that shows how many of the same questions (the sample size s) two people must answer in order to reach a .001 margin of error, or a highest possible match of 99.9%.<\/p>\n<p><a href=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure1_MarginError.png\" rel=\"attachment wp-att-28909\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-28909\" src=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure1_MarginError-300x256.png\" alt=\"Michalina Malysz_925510_assignsubmission_file_Figure1_MarginError\" width=\"300\" height=\"256\" srcset=\"https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure1_MarginError-300x256.png 300w, https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure1_MarginError.png 670w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>Now that we know how the computer arrives at a match percentage, you may wonder how these match percentages affect the odds of person A sending one or more messages to person B. It turns out that people at OKCupid were interested in this question as well and altered some of the displayed matches in the name of science. 
The displayed match percentage does have an effect on the likelihood of a message being sent and on the odds of a single message turning into a conversation. For example, if person A was told that they were only a 30% match with person B (and they really were a 30% match), there was a 14.2% chance that a message would be sent and about a 10% chance of a single message turning into a conversation of four or more messages. However, if person A was told that they were a 90% match (even though they were only a 30% match), the odds of sending one message rose to 16.9% and the odds of that message turning into an exchange of four or more rose to 17%.<\/p>\n<p><a href=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure2_30match.png\" rel=\"attachment wp-att-28910\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-28910\" src=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure2_30match-300x118.png\" alt=\"Michalina Malysz_925510_assignsubmission_file_Figure2_30match\" width=\"300\" height=\"118\" srcset=\"https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure2_30match-300x118.png 300w, https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure2_30match-768x301.png 768w, https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure2_30match.png 800w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p><a href=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure3_ActualvsDisplay.png\" rel=\"attachment wp-att-28911\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-28911\" 
src=\"http:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure3_ActualvsDisplay-300x128.png\" alt=\"Michalina Malysz_925510_assignsubmission_file_Figure3_ActualvsDisplay\" width=\"300\" height=\"128\" srcset=\"https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure3_ActualvsDisplay-300x128.png 300w, https:\/\/blogs.ams.org\/mathgradblog\/files\/2016\/05\/Michalina-Malysz_925510_assignsubmission_file_Figure3_ActualvsDisplay.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>I believe that the future of online dating is very broad and exciting. However, I have some concerns about the algorithm, since it relies heavily on a person&#8217;s honesty and self-assessment. If I were to analyze this topic further, I would look into how the length of the first message affects response rates, how it affects the odds that the conversation will continue for four or more messages, and whether those messages would be the same length as, longer than, or shorter than the initial message. Many questions have yet to be asked about this particular set of data and about the idea of online dating and matching with people who are possibly oceans away; the data will linger on the Internet for many years to come, and I&#8217;m sure it will be analyzed hundreds more times to answer many more questions.<\/p>\n<p>Citations:<br \/>\nHill, K. (2014, July 28).\u00a0<em>OKCupid Lied To Users About Their Compatibility As An Experiment<\/em>. <a href=\"http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/#4cbde4745eb1\">http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/#4cbde4745eb1<\/a><br \/>\nMatch Percentage. (n.d.). 
Retrieved April 26, 2016, from https:\/\/www.okcupid.com\/help\/match-percentages<br \/>\nRudder, C.\u00a0(2013,\u00a0February 13).\u00a0<em>Inside OKCupid: The math of online dating.\u00a0<\/em>[Video file]. <u>https:\/\/www.youtube.com\/watch?v=m9PiPlRuy6E<\/u><br \/>\nRudder,\u00a0C. (2014).\u00a0<em>Dataclysm: Who We Are*.\u00a0<\/em>New York:\u00a0Crown Publishers.<br \/>\n<em>Figure 1. <\/em>Margin of error vs. highest possible match. From \u201cMatch Percentage,\u201d <a href=\"https:\/\/www.okcupid.com\/help\/match-percentages\">https:\/\/www.okcupid.com\/help\/match-percentages<\/a>. Copyright [2015] by OkCupid. Reprinted with permission.<br \/>\n<em>Figure 2. <\/em>Odds of sending one and\/or more messages from 30% match. From \u201cOkCupid Lied To Users About Their Compatibility As An Experiment,\u201d by Kashmir Hill, 2014, <a href=\"http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/2\/#2f78a64f5eb1\">http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/2\/#2f78a64f5eb1<\/a>. Copyright [2014] by Forbes. Reprinted with permission.<br \/>\n<em>Figure 3. <\/em>Odds of a single message turning into a conversation based on match percent. From \u201cOkCupid Lied To Users About Their Compatibility As An Experiment,\u201d by Kashmir Hill, 2014, <a href=\"http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/2\/#2f78a64f5eb1\">http:\/\/www.forbes.com\/sites\/kashmirhill\/2014\/07\/28\/okcupid-experiment-compatibility-deception\/2\/#2f78a64f5eb1<\/a>. Copyright [2014] by Forbes. 
Reprinted with permission.<\/p>\n<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" ><\/div>","protected":false},"excerpt":{"rendered":"<p>Guest Author: Michalina Malysz &#8220;Like you use sentences to tell a person a story; you use algorithms to tell a story to a computer&#8221; (Rudder 2013). In today\u2019s day and age, we have the world at our fingertips. The internet &hellip; <a href=\"https:\/\/blogs.ams.org\/mathgradblog\/2016\/06\/08\/okcupid-math-online-dating\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" data-url=https:\/\/blogs.ams.org\/mathgradblog\/2016\/06\/08\/okcupid-math-online-dating\/><\/div>\n","protected":false},"author":93,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[14,19,21],"tags":[234,96,235],"class_list":["post-29014","post","type-post","status-publish","format-standard","hentry","category-math-in-pop-culture","category-statistics","category-technology-math","tag-math-and-dating","tag-statistics-2","tag-videos"],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p3gbww-7xY","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/posts\/29014","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/users\/93"}
],"replies":[{"embeddable":true,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/comments?post=29014"}],"version-history":[{"count":2,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/posts\/29014\/revisions"}],"predecessor-version":[{"id":29021,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/posts\/29014\/revisions\/29021"}],"wp:attachment":[{"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/media?parent=29014"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/categories?post=29014"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.ams.org\/mathgradblog\/wp-json\/wp\/v2\/tags?post=29014"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}