Inverse Functions: We’re Teaching It All Wrong!

By Frank Wilson, Chandler-Gilbert Community College; Scott Adamson, Chandler-Gilbert Community College; Trey Cox, Chandler-Gilbert Community College; and Alan O’Bryan, Arizona State University

What would you do if you discovered a popular approach to teaching inverse functions negatively affected student understanding of the underlying ideas? Would you continue to teach the problematic procedure or would you search for a better way to help students make sense of the mathematics?

A popular approach to finding the inverse of a function is to switch the \(x\) and \( y\) variables and solve for the \(y\) variable. The strategy of swapping variables is not grounded in mathematical operations and, we will argue, is nonsensical. Nevertheless, the procedure is so ingrained in textbooks and other curricula that many teachers accept it as mathematical truth without questioning its conceptual validity. As a result, students try to memorize the strategy but struggle to “accurately carry out mathematical procedures, understand why those procedures work, and know how they might be used and their results interpreted” (NCTM, 2009; Carlson & Oehrtman, 2005). As we will illustrate, this common process for finding the inverse of a function makes it harder for students to understand fundamental inverse function concepts.

Foundational Ideas about Functions and Their Inverses
A function \(f\) describes the relationship between two covarying quantities represented by variables \(x\) and \(y\). Without loss of generality, let \(x\) be the independent variable for the function \(f\) and \(y\) be the dependent variable for the function \(f\). The inverse function \(f^{-1}\) also describes the relationship between the quantities represented by variables \(x\) and \(y\) except \(y\) is designated as the independent variable for the function \(f^{-1}\) and \(x\) is designated as the dependent variable for the function \(f^{-1}\).

The following properties hold:

Concept #1: The domain of a function \(f\) is the range of its inverse function \(f^{-1}\) and the range of the function \(f\) is the domain of its inverse function \(f^{-1}\) (Wilson, 2007).

Concept #2: \(f^{-1}(f(x))=x\). In layman’s terms, the inverse function undoes whatever the function does (Bayazit & Gray, 2004).

These two concepts form the foundational ideas of the inverse function concept and hold true for functions represented in equations, graphs, tables or words.

Problematic Conceptions Arising from the Switch x and y Approach to Finding Inverse Functions
We define a conception as “problematic” if it describes an understanding that obscures connections to related ideas, introduces mathematical inconsistencies, and/or is likely to hinder students from developing powerful meanings of future topics. There are two problematic conceptions that emerge from the switch \(x\) and \(y\) approach to finding the inverse of a function.

Problematic Conception #1: The inverse of \(y=f(x)\) is \(y=f^{-1}(x)\).
In this statement, the independent variable for both \(f\) and \(f^{-1}\) is \(x\) and the dependent variable for both functions is \(y\). This problematic conception develops out of the procedure of switching \(x\) and \(y\) to find the inverse of a function, as illustrated in the following example.

Given \(f(x)=86x+15\), find \(f^{-1}\).  \[\begin{align*} f(x) &= 86x +15\\ y &= 86x +15\quad \text{since}\ y= f(x)\\ x&=86y+15\quad \textbf{switch x and y}\\ x-15 &=86y\\ y &= \frac{x-15}{86}\\ f^{-1}(x) &= \frac{x-15}{86} \end{align*}\]

To some educators, calling this statement a problematic conception may seem like heresy. However, it is easy to see the conceptual problem when the variables are assigned real world meanings.

In 2016 – 2017, tuition at the Maricopa Community Colleges was \(\$86\) per credit hour. All students registering to take classes were also required to pay a \(\$15\) registration fee. The function \(y=f(x)\) where \(f(x)=86x+15\) (introduced earlier) relates the number of credit hours, \(x\), to the total tuition cost (including the registration fee), \(y\). For clarity and emphasis, we change the variables in this equation to \(c\), for the number of credit hours assigned, and to \(t\), for the total tuition cost in dollars. The resultant equation is \(t=f(c)\) where \(f(c)=86c+15\). No matter what we do to mathematically manipulate this equation, the meaning of the variables \(t\) and \(c\) will remain unchanged. Suppose we are asked to calculate and interpret the meaning of \(f^{-1} (445)\). Using the switch \(x\) and \(y\) approach, we concluded earlier that \(y=f^{-1} (x)\) where \(f^{-1}(x)=\frac{x-15}{86}\). In terms of \(c\) and \(t\) this is \(t=f^{-1} (c)\) where \( f^{-1} (c)=\frac{c-15}{86}\).  So \[\begin{align*}f^{-1}(445) &= \frac{445-15}{86}\\  f^{-1}(445) &= \frac{430}{86}\\ f^{-1}(445) &= 5 \end{align*}\]

What is the meaning of the result? Since \(c\) is credits and \(t\) is tuition cost in dollars, the result must mean that \(445\) credits cost \(\$5\). This statement is false because credits cost \(\$86\) per credit hour! To make sense of  \(t=f^{-1}(c)\), we would have to change the meaning of the variables \(c\) and \(t\).

The confusion is easily remedied by applying an alternate strategy to finding the inverse. The strategy of solve for the dependent variable is demonstrated in the following example. As stated earlier, \(t\) represents the total tuition cost in dollars and \(c\) represents the number of credit hours assigned. For the inverse function \(f^{-1}\), \(c\) is the dependent variable so we solve the equation for \(c\).

Given \(f(c)=86c+15\), find \(f^{-1}\).

\[\begin{align*}  f(c) &= 86c+15\\ t &= 86c+15\quad \text{since}\ t=f(c)\\ t-15 &= 86c\\ c &=\frac{t-15}{86}\\ f^{-1} (t) &= \frac{t-15}{86} \end{align*}\]
Note that \(t\) is the independent variable and \(c\) is the dependent variable for the inverse function. \( f^{-1} (445)=5\) implies that when \(t=445\), \(c=5\). In other words, when the total tuition cost (including registration) is \(\$445\), then 5 credits are purchased. This statement is true.
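
As a minimal sketch, the solve for the dependent variable result can also be written in Python. The function names `tuition_cost` and `credits_purchased` are hypothetical names chosen for illustration; they are not part of the original example. The point is that each function's name and parameter keep the meanings of \(c\) and \(t\) fixed.

```python
def tuition_cost(credits):
    """f: credit hours c -> total tuition cost t in dollars (86 per credit plus a 15 fee)."""
    return 86 * credits + 15


def credits_purchased(total_cost):
    """f^{-1}: total tuition cost t in dollars -> credit hours c, from solving t = 86c + 15 for c."""
    return (total_cost - 15) / 86


# Evaluate f at 5 credit hours and f^{-1} at a total cost of 445 dollars.
cost = tuition_cost(5)
credits = credits_purchased(445)
print(cost, credits)
```

Because the input of `credits_purchased` is a dollar amount and its output is a number of credit hours, there is no way to read the result as "445 credits cost 5 dollars."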

By referring to basic inverse function concepts, we can also detect the fallacy in the statement, “The inverse of \(y=f(x)\) is \(y=f^{-1}(x)\).” Let \(x\) be the independent variable and \(y\) be the dependent variable of a function \(f\). Then \(y=f(x)\) . We know \[\begin{align*}f^{-1} (f(x)) &= x\quad \text{Concept 2}\\ f^{-1} (y) &= x\quad \text{since}\ y=f(x)\end{align*}\]

Notice that the independent variable for the inverse function \(f^{-1}\) is \(y\) and the dependent variable is \(x\). So the inverse of \(y=f(x)\) is \(x=f^{-1}(y)\) not \(y=f^{-1}(x)\).
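
As a quick check, Concept #2 can be verified for the tuition function using the variables \(c\) and \(t\) introduced earlier, with \(f(c)=86c+15\) and \(f^{-1}(t)=\frac{t-15}{86}\): \[\begin{align*} f^{-1}(f(c)) &= f^{-1}(86c+15)\\ &= \frac{(86c+15)-15}{86}\\ &= \frac{86c}{86}\\ &= c \end{align*}\]

The input to \(f^{-1}\) is a tuition cost and the output is a number of credit hours, which is consistent with \(c=f^{-1}(t)\).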

The tuition example represents a traditional exercise where students focus only on a memorized procedure. Carlson and Oehrtman (2005) warn that “this procedural approach to determining ‘an answer’ has little or no real meaning for the student unless he or she also possesses an understanding as to why the procedure works.” The conceptual weakness of the problematic approach to finding the inverse becomes evident with functions representing real world contexts.

Keeping track of the meaning of variables is essential when working with exponential and logarithmic functions. Understanding that \(y=b^x\) is equivalent to \(\log_b y = x\) is key to understanding logarithms conceptually. The switch \(x \) and \(y\) approach to finding inverses obscures the inverse relationship between exponential and logarithmic functions. For example, suppose that \(f(x)=3^x\). Find \(f^{-1}\).

\[\begin{align*} \textit{Switch x}\ &  \textit{and y}\ \text{approach} & \textit{Solve for the}\ & \textit{dependent variable}\  \text{approach} \\ f(x) &= 3^x & f(x) &= 3^x\\  y &= 3^x & y &= 3^x\\ x& =3^y\quad \text{switch}\ x\ \text{and}\ y & \log_3 y &= x\\ \log_3 x &= y & f^{-1}(y) &= \log_3 y \\ f^{-1}(x) &= \log_3 x & & \\ \end{align*}\]

Using the switch \(x\) and \(y\) approach, it is common for students to conclude incorrectly that \(\log_3 x=3^x\) because of the statements \(\log_3 x=y\) and \(y=3^x\) included as part of the problem solving process. No such confusion exists when the solve for the dependent variable approach is used.
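
The relationship between \(y=3^x\) and \(\log_3 y=x\) can be sketched numerically in Python. This is only an illustration: the names `f` and `f_inverse` mirror the notation above, and `math.isclose` is used because floating-point logarithms are approximate.

```python
import math


def f(x):
    """f: exponent x -> power y, where y = 3**x."""
    return 3 ** x


def f_inverse(y):
    """f^{-1}: power y -> exponent x, from solving y = 3**x for x (x = log base 3 of y)."""
    return math.log(y, 3)


x = 2
y = f(x)               # y = 3**2
x_back = f_inverse(y)  # log base 3 of y recovers the exponent, up to rounding

# The inverse undoes f: f_inverse(f(x)) returns x (Concept #2).
print(math.isclose(x_back, x))

# log base 3 of a number and 3 raised to that number are different quantities.
print(math.isclose(f_inverse(9), f(9)))
```

The first comparison holds and the second does not, which is the distinction that the statements \(\log_3 x=y\) and \(y=3^x\) blur when they appear in the same derivation.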

Problematic Conception #2: With the horizontal axis representing the independent variable and the vertical axis representing the dependent variable, the graphs of \(f\) and \(f^{-1}\) may be drawn on the same axes. The resultant graphs are symmetric about the line \(y=x\).

It is true that the graphs of \(y=f(x)\) and \(y=f^{-1} (x)\) are symmetric about the line \(y=x\) but, as established earlier, there are inherent issues with saying that \(y=f^{-1} (x)\) is the inverse function of \(y=f(x)\). The result \(y=f^{-1} (x)\) comes from switching the \(x\) and \(y\) variables in the inverse function. In fact, switching the variables in any mathematical relation will create a graph that is symmetric about the line \(y=x\). The practice of graphing \(f(x)\) and \(f^{-1}(x)\) on the same axes should be avoided (Van Dyke, 1996) because it muddles the concept of inverse. Instead, \(f(x)\) and \(f^{-1}(y)\) should be graphed on separate axes labeled appropriately with \(x\) or \(y\) on the horizontal axis.

The conceptual problems which occur when graphing \(f(x)\) and \(f^{-1}(x)\) on the same axes are evident when modeling even the simplest real-world context. The weekly earnings, \(y\), of an employee earning \(\$10\) per hour who works \(x\) hours in a week is given by \(y=10x\). The independent variable for the function \(f\) is \(x\) and the dependent variable is \(y\). For the inverse function \(f^{-1}\), the dependent variable is \(x\) so we solve \(y=10x\) for \(x\) and get \(x=\frac{1}{10} y\). We have \(y=f(x)\) with \(f(x)=10x\) and \(x=f^{-1} (y)\) with \(f^{-1} (y)=\frac{1}{10} y\). If we switch the \(x\) and \(y\) variables in the inverse function equation, we get \(y=f^{-1} (x)\) with \(f^{-1} (x)=\frac{1}{10} x\) . We graph \(f(x)\) and \(f^{-1} (x)\) on the same axes and label the axes with the variables \(x\) and \(y\) as is customary. We include the units associated with the variables \(x\) and \(y\).

Graphing \(y=f(x)\) and \(y=f^{-1}(x)\) on the same set of axes.

From the graph, we see that \(f^{-1} (20)=2\). The \(x\)-axis is labeled hours worked weekly and the \(y\)-axis is labeled weekly earnings (dollars), so this must mean that when the employee works \(20\) hours the employee earns \(\$2\). But this doesn’t make sense because we know the employee makes \(\$10\) per hour! We could remove the labels from the axes, but this does not help someone understand a function’s graph as a visual representation of a relationship between two quantities and is likely to make it even harder to comprehend the meaning of a point on the graph. Graphing \(y=f(x)\) and \(y=f^{-1} (x)\) on the same axes created confusion and did nothing to help us understand inverse functions.

There are two equally viable strategies for representing functions and their inverses graphically. The first strategy is to graph each function on its own pair of coordinate axes with the horizontal axis representing the independent variable and the vertical axis representing the dependent variable of the function.

Graphing \(y=f(x)\) and \(x=f^{-1}(y)\) on separate axes

From the graph of \(f\), we determine that \(f(2)=20\) means that working 2 hours weekly results in weekly earnings of \(\$20\). From the graph of \(f^{-1}\), we determine that \(f^{-1} (20)=2\) means that when weekly earnings were \(\$20\) the number of hours worked was \(2\). Both results make sense in the real-world context.

The second strategy for graphing a function and its inverse comes from changing the way we think about graphs. With this approach, we use the same graph to represent a function and its inverse but designate the horizontal axis to represent the independent variable for \(f\) and the vertical axis to represent the independent variable for \(f^{-1}\) (Moore, Liss, Silverman, Paoletti, LaForest, & Musgrave, 2013). Observe that to determine \(f(2)\) we start at \(x=2\) on the horizontal axis and move vertically until we touch the graph of \(f\). We then move horizontally until we touch the vertical axis at \(y=20\). We conclude \(f(2)=20\). To determine \(f^{-1} (20)\) we start at \(y=20\) on the vertical axis and move horizontally until we touch the graph of \(f^{-1}\). We then move vertically until we touch the horizontal axis at \(x=2\). We conclude \(f^{-1} (20)=2\).

Using the same graph to represent a function and its inverse but designating the horizontal axis to represent the independent variable for \(f\) and the vertical axis to represent the independent variable for \(f^{-1}\)

This way of thinking can be powerful for students who recognize the equation \(f(x)=30\) is equivalent to \(x=f^{-1} (30)\). The student finds \(30\) on the vertical axis and determines the corresponding value on the horizontal axis is \(3\). The student concludes that the solution to \(f(x)=30\) is \(x=3\) because \(f^{-1} (30)=3\).
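
The idea of reading one graph in two directions also works when no formula is available. As a rough sketch, assume the earnings relationship is stored as a table of values; the dictionary names below are hypothetical and chosen only for illustration.

```python
# Hours worked (x) -> weekly earnings in dollars (y), for y = 10x.
earnings_by_hours = {1: 10, 2: 20, 3: 30, 4: 40}

# Reading the same pairs in the other direction gives the inverse:
# weekly earnings in dollars (y) -> hours worked (x).
# This only works because no two inputs share an output (f is one-to-one).
hours_by_earnings = {dollars: hours for hours, dollars in earnings_by_hours.items()}

print(earnings_by_hours[2])   # f(2): earnings for 2 hours
print(hours_by_earnings[20])  # f^{-1}(20): hours needed to earn 20 dollars
print(hours_by_earnings[30])  # solves f(x) = 30
```

No variables are renamed at any point. The same pairs are simply looked up starting from the other quantity, just as the graph is read starting from the vertical axis.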

Bayazit and Gray (2004) claim that learners with a conceptual understanding of inverse functions were able to deal with the inverse function concept in situations not involving formulas whereas learners limited by a procedural understanding of inverse functions (e.g. switch \(x\) and \(y\)) were less likely to be successful in a context without a formula.

A side benefit of discarding the switch \(x\) and \(y\) approach is that it frees learners from the \(x\)-addiction – the notion that only \(x\) can be the independent variable. In graphing, the \(x\)-axis becomes the horizontal axis and the \(y\)-axis becomes the vertical axis. The reality is that disciplines outside of mathematics rarely use \(x\) to represent the horizontal axis and \(y\) to represent the vertical axis. Rather, they use variable names (perhaps even complete words) that make sense in the context of the situation. Since, as we propose, the axes are no longer tied to \(x\) and \(y\), learners think more deeply about the concepts of independent and dependent variables when graphing real world data models such as \(p=f(t)\) where \(f(t)=298,213,000(1.009)^t\) and \(\textit{height}\ = f(\textit{time})\) where \(f(\textit{time})=-8.99 \cos(\frac{\pi}{6}\cdot \textit{time})+12.74.\)
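
The same approach applies to a model such as \(p=f(t)\). The following sketch solves \(p=298,213,000(1.009)^t\) for \(t\) to obtain \(t=f^{-1}(p)\). The function and constant names are hypothetical, and no units are assumed for \(t\) because the text above does not state them.

```python
import math

INITIAL_POPULATION = 298_213_000  # coefficient in p = 298,213,000(1.009)^t
GROWTH_FACTOR = 1.009


def population(t):
    """f: t -> population p (t is the independent variable of the model)."""
    return INITIAL_POPULATION * GROWTH_FACTOR ** t


def time_for_population(p):
    """f^{-1}: population p -> t, from solving p = 298,213,000(1.009)^t for t."""
    return math.log(p / INITIAL_POPULATION) / math.log(GROWTH_FACTOR)


t = 10
p = population(t)
# The inverse recovers the original input, up to floating-point rounding.
print(math.isclose(time_for_population(p), t))
```

Here \(p\) is the independent variable of the inverse and \(t\) is the dependent variable, so neither letter changes meaning.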

When students understand the concept of inverse function in the context of a real world situation, they engage in reasoning (the process of drawing conclusions on the basis of evidence or stated assumptions (NCTM, 2009)) and sense making (developing understanding of a situation, context, or concept by connecting it with existing knowledge (NCTM, 2009)). This connects directly with the Standards for Mathematical Practices – specifically Math Practice #1 (make sense of problems and persevere in solving them) and Math Practice #2 (reason abstractly and quantitatively) (National Governors Association, 2010). The Mathematical Association of America encourages similar ways of thinking in their Committee on the Undergraduate Program in Mathematics Curriculum Guide (MAA, 2015). Cognitive Recommendation #1 states that Students should develop effective thinking and communication skills. All such connections help students understand and retain new information, something that is more challenging if students are not engaged in reasoning and sense making (Hiebert et al., 1997).

A correct understanding of inverse functions empowers learners mathematically. By eliminating the switch \(x\) and \(y\) approach and implementing the solve for the dependent variable approach, teachers can reduce confusion and enhance student understanding. By recognizing that the inverse of \(y=f(x)\) is \(x=f^{-1}(y)\), learners can make sense of inverse functions in multiple mathematical contexts including real world data analysis and modeling.

Adapted from an article by the same authors, listed in the references below.

Bayazit, I. and Gray, E. (2004, July). Understanding inverse functions: the relationship between teaching practice and student learning. Proceedings of the 28th Conference of the International Group for the Psychology of Mathematics Education: Vol. 2. (pp. 103–110).

Carlson, M. & Oehrtman, M. (2005). Key aspects of knowing and learning the concept of function. Mathematical Association of America Research Sampler, No. 9, March 2005.

Hiebert, J., Carpenter, T., Fennema, E., Fuson, K., Wearne, D., Murray, H., et al. (1997). Making sense: Teaching and learning mathematics with understanding. Portsmouth, NH: Heinemann.

Mathematical Association of America (2015). 2015 CUPM Curriculum Guide to Majors in the Mathematical Sciences. Carol S. Schumacher and Martha J. Siegel, Co-Chairs, Paul Zorn Editor. Washington, DC: MAA.

Moore, K. C., Liss II, D. R., Silverman, J., Paoletti, T., LaForest, K. R., and Musgrave, S. (2013). Pre-Service Teachers’ Meanings and Non-Canonical Graphs. In Martinez, M. & Castro Superfine, A. (Eds.), Proceedings of the 35th annual meeting of the North American Chapter of the International Group for the Psychology of Mathematics Education (pp. 441-448). Chicago, IL: University of Illinois at Chicago.

National Council of Teachers of Mathematics (2009). Focus in High School Mathematics: Reasoning and Sense Making. Reston, VA: NCTM.

National Governors Association Center for Best Practices, Council of Chief State School Officers (2010). Common Core State Standards – Mathematics. National Governors Association Center for Best Practices, Council of Chief State School Officers, Washington D.C.

United States Census Bureau. (2006). Table 96. Expectation of Life at Birth, 1970 to 2003, and Projections, 2005 and 2010. (NTIS No. PB2006500023)

Van Dyke, F. (February 1996). The inverse of a function. Mathematics Teacher. 89, pp. 121 – 126.

Wilson, F. (2007). Finite mathematics and applied calculus. Boston: Houghton Mifflin Company.

Wilson, F.C., Adamson, S., Cox, T., and O’Bryan, A. (March 2011). Inverse functions: What our teachers didn’t tell us. Mathematics Teacher. 104, pp. 500-507.

World Health Organization. (2006). World Health Statistics 2006. WHO Press. Geneva, Switzerland.


41 Responses to Inverse Functions: We’re Teaching It All Wrong!

  1. Scott Adamson says:

    There are many examples of this in undergraduate mathematics…what I mean by this: showing students how to do something so that they can get the right answer on the test “on Friday” but does not produce productive ways of thinking that are useful for making sense of future mathematical ideas.

    For example: ask adults (family, neighbors, etc.) “What does average mean? For example, Larry Bird averaged 24.3 points per game over his career. What does this mean?”

    People will say…”it means that if you add up all the points he scored and divide by how many games he played in, you will get 24.3.”

    This does not explain the meaning of the 24.3 ppg…it only tells how to compute the average! Once this number is computed, how do we explain what it means?

    Students think mathematics is mostly about doing a procedure or computation…and their job is to remember which procedure to do and when to do it to produce the right answer.

    We hope to encourage a discussion about shifting this student belief to a focus on meaning, sense making, and understanding so that procedure and computations make sense and are implemented with purpose and meaning!

    • Howard Phillips says:

      “Students think mathematics is mostly about doing a procedure or computation…and their job is to remember which procedure to do and when to do it to produce the right answer. ”
      Of course they do, this is how they were taught, and this is how their teachers were taught, and this is how tests are taught. (a lot of each of them).

  2. Bruce Aubertin says:

    “People will say…”it means that if you add up all the points he scored and divide by how many games he played in, you will get 24.3.””
    “This does not explain the meaning of the 24.3 ppg…it only tell how to compute the average! Once this number is computed, how do we explain what it means? “”

    I wonder how you, Scott, would answer your own question, or hope that a grade 12 student would answer the question. I asked some statisticians in our department to answer the question and got answers like “balance point of the distribution.” I disallowed answers like “typical value” or “central value” which would not distinguish it from other central measures. Expected value? I’m really curious, and don’t know myself yet how I would explain this deeply intuitive concept of average. At least the answer telling how it is calculated is superior to the absolute nonsense of thinking interchanging x and y in a relation gives you anything other than changing the names of the variables! I asked my class once to give the inverse function of C=5/9(F-32) (good for a Canadian visiting Seattle) so it would be good for an American friend visiting Vancouver. Sure enough, some of them interchanged C and F, how confusing is that!

    • Trey Cox says:

      Hi Bruce,

      I love the discourse that our article has generated! How fun! You have a great question about average and one that we get quite often by mathematicians, math educators, and students! We encourage people to think about what the procedure for determining the average represents and that helps one reason through what the numerical value that they have calculated actually means.

      For example, assume that we wanted to calculate the average amount of money the students in one of your classes has with them on a particular day. If we literally carried out the process of determining the average we could ask everyone to come forward and dump out all of their money from their purses and wallets on a desk (add up all of the money) and then we would proceed to distribute the money to each student one cent/dollar at a time (divide by the number of students). What would the result be of this process? Every student would have the exact same amount of money! Therefore, the average amount of money in the class would mean that IF every student had the same amount of money they would all have $x.xx.

      Going back to Scott’s analogy…”Larry Bird averaged 24.3 points per game over his career. What does this mean?” Using the reasoning we just demonstrated with the classroom money analogy, we can see that an average of 24.3 ppg would mean that IF Larry Bird had scored the exact same amount of points in each game he played over his career he would have scored 24.3 points.

      Finally, if the average score of the students on your last Calculus exam was 78% that tells us that IF all of the students in your course had all scored the same they all would have scored 78%.

      As you can see, the average is “the great equalizer” or the “equal distribution” of values and is a hypothetical value (IF…).

      Having this kind of conceptual understanding of what average (or mean) means, helps one with many other mathematical concepts such as the mean value theorem, average value of a function, etc.

      The author team heavily promotes helping students (and teachers) to make sense of the big mathematical ideas and then have the procedures naturally emerge from that understanding. Both our article on inverse functions and Scott’s average example demonstrate what we are trying to achieve.

      All the best!


      I hope this helps!

      • Alan Cooper says:

        Well, I think we all would agree that the equal sharing problem is one possible motivating application for the concept of average, but not one that is particularly well suited to the points-per-game situation (where something based on the expectation of a betting game might be more appropriate). If the goal is to get people to think in terms of interpretation rather than process, then perhaps asking what they would *use* the average for might generate less procedural responses than asking what it “means”. But in any case I agree with Bruce’s suggestion that the procedural approach to saying what an average means is probably far less likely to lead to misunderstandings than some of the stuff that gets written about inverse functions.

        • Scott Adamson says:

          Thanks for the comment, Alan!

          If you have time, I am interested to hear more about what you mean when you say, “where something based on the expectation of a betting game might be more appropriate”.

          How would you explain the Larry Bird 24.3 points per game situation based on “the expectation of a betting game”?

          We would say that the equal sharing situation works if we consider the average to be a hypothetical value. IF Larry Bird scored the same number of points in every game of his career, that number would need to be 24.3 in each game in order to create the total number of points actually seen in the number of games actually played.

          This allows us to compare Larry’s career to say, Magic Johnson’s career.

          Also, I think what we mean when we say “what does it mean” is to articulate an interpretation. A computation is not a meaning…but an interpretation gets at the meaning.

          Thanks again!


          • Alan Cooper says:

            Yes Scott, I too think that it’s important to encourage students to have well-founded motivations for what they do and believe. So I very much appreciate your effort to avoid (or at least supplement) rote procedures. But another pet peeve of mine is that while we often criticize (and even penalize) students for imprecise use of language, teachers also often use imprecise language and unstated conventions themselves – which amplifies the sense of resentful unfairness which many people seem to feel towards our subject. I know (and appreciate) what you mean by “meaning”, but the person who interprets it more narrowly and procedurally is not necessarily *wrong*.

            With regard to the average score business, since “sharing” points between games is a bit of a stretch (especially since hypothetical score of 24.3 could only occur in a universe where the game was completely different!), I would suggest the game of selecting a game at random from Larry Bird’s record and winning a number of dollars equal to his score in that game. The average score is then what might be called a fair price to charge for playing that game (in the sense that over the long term wins and losses would be expected to balance out).

          • Scott Adamson says:


            I think that this is a case where the procedure (or computation) and the interpretation (or meaning?) can work together to help make sense of the 24.3 points per game situation. (I don’t need to tell you how to compute the average, but bear with me…) To compute the 24.3 points per game, we would start by summing up the number of points scored in each of the many, many games Larry Bird played. This total sum is the grand total of all the points Larry scored over his career. Now we equally distribute these points to each game he played….as I like to say with my students…”it’s as if…” It’s as if Larry scored the same number of points in each game. That’s what we are really saying when we distribute that total number of points scored equally among the number of games he played. It is hypothetical. It cannot really happen.

            This happens often with average.

            I just googled the average class size in elementary schools in Arizona. It is (according to 23.6 students. Of course, the idea of 0.6 students makes no sense. But, IF every elementary classroom in Arizona had the same number of students…if we equally distribute all of the elementary students into all the available classrooms, each classroom would hypothetically get 23.6. The reality is some would get 24 and some would get 23. But this average, as hypothetical as it is, allows us to compare with other states.

            I don’t know about bringing winning money into the interpretation of the 24.3 points per game since we are not talking about money at all…just points…and equally distributing them.

            How about this…I have 4 kids…Trey has 3…on average, we have 3.5 kids! It’s as if…it’s a hypothetical…it’s imaginary…but if we equally distribute our 7 kids, we each get 3.5 kids, on average!

            I enjoy talking about this…I hope you do too!


          • Alan Cooper says:

            Yes, I enjoy this too. If I didn’t, then I hope I would have the sense to stop doing it!
            And your “as if” language for explaining what the average “means” is fine with me (and also, I suspect, with Bruce too, though I shouldn’t claim to speak for him).

            The only thing I think we were objecting to is the implication that those generic adults who gave the procedural definition were unaware of anything more than that, or that just giving the procedure should be faulted as an explanation of the meaning if the questioner hasn’t made it clear that he or she is looking for some “interpretation” rather than a mere definition. Teachers should certainly do more, but in the case of averages those who just give the procedure aren’t doing any actual harm and aren’t leaving the student far from being able to come up with the sharing interpretation on their own account.

            The difference in the case of inverse functions is that a purely procedural approach may actually lead to serious misconceptions and leaves the average student far from any real understanding of what it’s all about.

    • Brian says:

      This precise issue happened with the Fahrenheit/Celsius conversion problem right here in this reddit thread:

  3. Sheldon Axler says:

    I completely agree with the main point made by the authors: it is a terrible idea to teach students to find the inverse of a function f by starting with y = f(x), interchange the roles of x and y, and then solve the equation x = f(y) for y. Instead, stay with the equation y = f(x) [or whatever variables are being used] and solve this equation for x, getting an expression for f^{-1}(y). The key point is that students should understand that y = f(x) is equivalent to f^{-1}(y) = x; this understanding is unlikely to occur if the variables are interchanged. Furthermore, as pointed out by the authors, it makes no sense to interchange the variables when letters for the variables have been chosen to reflect real world-problems. My Precalculus textbook is one of the few textbooks at this level that does not use the poor idea of interchanging the variables to find an inverse function.

    • Frank Wilson says:

      Thanks, Sheldon, for your efforts as an author in promoting this way of thinking about inverse functions. When we wrote Precalculus: A Make It Real Approach, we too observed that most precalculus textbooks included the switch-x-and-y approach instead of the solve-for-the-independent variable approach. It is our hope that other authors will embrace the inverse function strategy you and I have embraced.

  4. abel says:

    i agree with most of what is said. i have students unable to compute $f^{-1}(1/3)$ but can compute $f^{-1}(x)$ correctly. i tested on this on my last test with $f(x) = \frac{x}{1+x}$

    • Frank Wilson says:

      Hi Abel,
      I understood you to say that your students were able to determine f^-1(x) but not f^-1(1/3) when you used the function f(x) = x/(1+x). If y = f(x), then f^-1(y) = y/(1-y). (As explained in the article, f^-1(x) is nonsensical.) If I understood you correctly, the student was unable to calculate f^-1(1/3) = (1/3)/(1-1/3), which simplifies to 1/2. That is, when y = 1/3, x = 1/2.

      If the student was able to correctly determine the inverse function but unable to evaluate the inverse function at y=1/3, I believe the gap in the student’s skill is not in the determination of the inverse but rather in simplifying rational expressions involving fractions.

      Just out of curiosity, which portion of the article did you disagree with?

      • Art Duval says:

        abel: I love that assessment question.

        Perhaps the gap in the student’s skill is not about rational functions but about not knowing what functions mean. For instance, perhaps s/he could compute f^-1(y) algebraically, but doesn’t realize that you can then just plug in 1/3. Or perhaps, s/he *only* knows how to compute inverses when the problem is presented in exactly the same form it always is; then posing the problem with this slight variation is the monkey wrench.

        abel, I would be very curious to hear which of these things caused your student(s) trouble, or if it was something else altogether.

      • Alan Cooper says:

        Hi Frank,
        Whether or not, as you say, “f^-1(x) is nonsensical”, depends on how the function concept is being interpreted.
        In a physical or other applied context one might think of a function as a relationship between two different quantities, as in Boyle’s Law where P=f(V)=k/V, and V=f^-1(P)=k/P, with f only accepting volume values and f^-1 accepting only pressure values. In such a context one might well say that “f^-1(V) is nonsensical”, but in the mathematical context we usually interpret a function as a relationship between numbers without regard to their interpretation (in which case Boyle’s f and f^-1 happen to be identical).

        With the mathematical interpretation, if f^-1 is defined by
        f^-1(y) = y/(1-y) for all acceptable values of the input y,
        then f^-1(1/3) = (1/3)/(1-(1/3))=1/2
        f^-1(1/2) = (1/2)/(1-(1/2))=1
        f^-1(whatever) = (whatever)/(1-(whatever))
        and so on.
        So without any problem we also have f^-1(x) = x/(1-x).
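
        The evaluations listed above can be checked with a short Python sketch. The names f and f_inv are illustrative, exact fractions are used only to avoid rounding, and the parameter name t is an arbitrary placeholder for the input.

```python
from fractions import Fraction

def f(x):
    """f(x) = x/(1+x), defined for x != -1."""
    return x / (1 + x)

def f_inv(t):
    """Inverse of f: t/(1-t), defined for t != 1. The name t is arbitrary."""
    return t / (1 - t)

third = Fraction(1, 3)
half = Fraction(1, 2)

print(f_inv(third))     # 1/2
print(f_inv(half))      # 1
print(f(f_inv(third)))  # 1/3, so f undoes f_inv
```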

        • Frank Wilson says:

          Hi Alan,
          I agree with you we can substitute any variable or value into a function formula. In that sense, f^-1(x) = x/(1-x) is viable just as f^-1(@) = @/(1-@) and f^-1(#) = #/(1-#) are viable. The dilemma comes when we seek to describe the relationship between a function f and a function f^-1. As soon as we seek to describe that relationship, we have to consider the relationship between the respective domains and ranges of the functions. I believe my coauthors have some additional insight on this issue.

          • Bruce Aubertin says:

            For the interchange people, to find f^-1(1/3) for y=f(x)=x/(1+x),
            first write y=f(1/3)=1/3/(1+1/3), then interchange y and 1/3 and solve for y.
            Voila, y=1/2.
            I’m joking, of course. It was not clear from Abel’s comment if he asked on the test for (a) f^-1(x) followed by (b) f^-1(1/3) (which is puzzling if they could do (a) and not (b)), or if he asked for (b) by itself and students couldn’t do it (whereas they had shown they could do (a) when they had a y and an x to work with).

          • Scott Adamson says:


            I am interested in your comment that, when computing inverse functions in a mathematical context, the symbols used to represent variables really don’t matter. If f(x)=x/(x+1), then f^-1(x)=x/(1-x). Your claim is that, void of context, “without any problem we have f^-1(x)=x/(1-x).”

            I actually see problems!

            1. If we allow students to express the inverse as claimed, without regard for the symbols used to represent the variables, then we are limiting their understanding of inverse functions to just be something that needs to be done. Students might think, “I remember…inverse functions is that thing where we switch the x and y and solve for y and then call the result f^-1(x).” That’s it…end of story (for some students).

            I work hard each and every day in my classroom to help students make sense, reason with mathematics, problem solve, etc. One of the biggest obstacles to student development of the above is their belief that mathematics is something to do. They do first without regard to why they’re doing it, what it means to do that, or whether it’s even relevant to do it.

            For example, students in my Calculus 1 class recently worked through an activity to develop what we called the “total change principle” which is the fundamental theorem of calculus. Not long after, they were given a velocity vs. time graph and asked to find the distance traveled. This velocity vs. time graph was a decreasing linear function. To get the distance traveled, they only needed to find the area of a triangle. But instead, I find students making up weird ways to use the fundamental theorem to answer the question. They used the graph to find v(b) and v(a) and then subtracted v(b) – v(a) and called that the distance traveled. They don’t think and reason and then act to do what makes sense, they just do…and often, they do whatever it was that they most recently learned…and they will do so incorrectly as in this case!

            My point is that some students think that their job is to DO whatever the teacher just showed them to do…no matter the context or situation. I told them…it’s like someone is learning to use a tool, like a hammer, for the first time. They then try to use that hammer for every job. If they need to cut a two-by-four, they whack it with their hammer. Ok, maybe that’s a stretch, but in my example, trying to impose the total change principle is like whacking the board with a hammer hoping it will cut in just the way it needs to be cut.

            2. With or without a context, when we say “y = f(x)”, we are trying to communicate a relationship between two quantities, x and y. We can visually represent that relationship with a graph. By traditional convention, we keep track of the “x” quantity on the horizontal axis and the “y” quantity on the vertical axis. In the example, y=f(x) where f(x) = x/(x+1), we see that x = 2 is paired with y=2/3.

            Now we consider the inverse, x=f^-1(y) where f^-1(y)=y/(1-y). We see that when y=2/3, x=2. This is the same ordered pair we see with f(x)! That makes sense and keeps the relationship between the quantities x and y clear.

            If, on the other hand, we ignore any relationship between quantities and say y=f^-1(x) where f^-1(x)=x/(1-x), when x=2, y=-2. Hmmmm….

            3. If we are consistent in our teaching, with or without a real-world context, students have a productive way of thinking that is useful in all cases. If they are thinking about relationships and quantities as demonstrated in item #2 above (and in the article), they can use that same way of thinking in a real-world context and still be in good shape…everything makes sense!
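
            The ordered-pair argument in item 2 can be illustrated with a small Python sketch. The function names and the sample x-values are assumptions chosen for illustration; the pairs are always written as (x, y) so the quantities keep their roles.

```python
from fractions import Fraction

def f(x):
    """y = f(x) = x/(x+1)."""
    return x / (x + 1)

def f_inv(y):
    """x = f_inv(y) = y/(1-y)."""
    return y / (1 - y)

xs = [Fraction(n) for n in (1, 2, 3)]

# Pairs (x, y) on the graph of y = f(x).
pairs_from_f = {(x, f(x)) for x in xs}

# Pairs (x, y) produced by x = f_inv(y), using the same y-values.
ys = [f(x) for x in xs]
pairs_from_f_inv = {(f_inv(y), y) for y in ys}

print(pairs_from_f == pairs_from_f_inv)  # True: same relationship, same points
print(f(Fraction(2)))                    # 2/3
print(f_inv(Fraction(2, 3)))             # 2
print(f_inv(Fraction(2)))                # -2: treating 2 as an input of f_inv gives a different pair
```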


          • Alan Cooper says:

            Hi Scott,
            I find it interesting that you see problems with going from f^-1(y)=y/(1-y) to f^-1(x)=x/(1-x).
            I see more problems with telling students that certain functions can only be applied to certain variable names. Would you say that ln and exp are inverse functions only when one has independent variable denoted by x and the other by y? [By the way, note that I omitted the variables not to avoid the problem but because to include them is wrong. ln(x) is not a function. If anything it is the value of that function when the input is “x”. But until x has been defined it is a meaningless expression, and it cannot even be part of a compound expression unless x has been either specified or quantified. – as in “If x=e then ln(x)=1”, or “for all x>0, ln(x)<x”, or “for every real y there is an x with y=ln(x)”.]

            The root of many problems I see people having with functions is due to having expressions written in terms of unquantified variables (or worse with an assumed quantifier that is not stated and which students are penalized for not assuming). Without some information about what values of each variable are being considered, “y=f(x)” is meaningless. It is true for some pairs (x,y) and not others. For points where y=f(x), and for which there is no other such point with the same y-value, we can say x=f^-1(y). And for points where x=f(y) with no other such point having the same x-value we can say of the unique y that works that y=f^-1(x).

            Here's how I would respond to the perceived problems in your list:

            1. "If we allow students to express the inverse as claimed, without regard for the symbols used to represent the variables, then we are limiting their understanding of inverse functions to just be something that needs to be done."
            On the contrary, if we define a function f by “f(x) = x/(x+1)” then we should be sure that the students understand that this is just shorthand for “let f be defined by f(x)=x/(x+1) for all real x (other than -1)”. It should then be clear to them that no matter what is put in place of x the same pattern applies. So f(y)=y/(y+1) and so on. Until we have brought them to this level of understanding it is downright criminal to even start to talk about inverse functions!

            And once we do have a basic understanding of functions, I would avoid the “interchange” ritual by never asking a nonsensical question like “find the inverse function to y=f(x)=x/(x+1)”. Instead I would ask “find a formula for f^-1(x) where f is defined by f(x)=x/(x+1)”(having already clarified the convention about omitting the quantification and restriction on x). Then I would expect a solution that either begins with “IF y=f(x) then x=f^-1(y), but if y=f(x) then y=x/(x+1), so (solving) x=y/(1-y), so f^-1(y)=etc, etc, etc.” OR (better) “LET y=f^-1(x). Then x=f(y)=y/(y+1). So (solving) we get y=x/(1-x). So f^-1(x)=x/(1-x)”.
            If they have any sense (and/or have not been confused by any previous teaching of the “switch” nonsense) then they choose the latter – and never even think of “switching” the variables.

            The only place “switching” comes in is in the graphing, where, after giving them the conventionally drawn graph of an invertible function f (labeled as y=f(x)), I ask them for the graph of the inverse function. When they are done, I tell them that working it all out was fine, but with that wording of the question all they really had to do was use the same graph and label it as x=f^-1(y) [just as you do in example 2 of the article!]. Then I tell them to draw the graph of y=f^-1(x), and after letting them fuss with it I give as “my” solution just crossing out the x and y labels and switching them. Finally, I ask for the graph in the usual orientation – and then “my” solution is just to flip the transparency over its diagonal.

            2. "With or without a context, when we say “y = f(x)”, we are trying to communicate a relationship between two quantities, x and y. … where f(x) = x/(x+1), we see that x = 2 is paired with y=2/3. … If, on the other hand, we ignore any relationship between quantities and say y=f^-1(x) where f^-1(x)=x/(1-x), when x=2, y=-2. Hmmmm…."

            Why is this puzzling? Yes, y=f(x) defines a relationship. But y=f^-1(x) is a *different* relationship so in this relationship x=2 is associated with a different value of y.

            But I think the business of context deserves further comment: As in your hours and dollars example or in the Boyle's Law example that I mentioned in response to Frank, many practical applications involve functions not from R to R but between two distinct isomorphic images of R (with different units attached). And in those cases, where the inputs are not actually numbers but physical quantities, the types of input to f and f^-1 are different and so if f is the function giving pressure P in terms of volume V with P=f(V) then f^-1(V) doesn't make sense since the input of f^-1 has to be a pressure (which includes both a number and a unit). And I appreciate the fact that, in your article, you mentioned the different units and made a point of including them in the axis labels.

            However, just so as not to seem too agreeable, I would say that IF we introduce specific units and define P and V not as the actual pressure (eg P=3bar) and volume (eg V=1.5m^3), but rather as the numbers of units (ie P=3 and V=1.5) and define F(V) as the function which gives the number of bars corresponding to a volume of V cubic meters, THEN in fact F^-1(V) is a perfectly well-defined expression (though not physically interesting except when V is actually the number of bars of pressure in the sample of gas). Similarly, in your example IF f(x) is the actual money earned in time x then f^-1(x) doesn't make sense, but if f(x) is just the number of dollars earned in x hours, then x is a number not a time, and f^-1(x) is the number of hours required to earn x dollars.

            3. I think this was more a point than a problem. And it's one I agree with.

            I hope it's clear by now that I found very little to quarrel with in your article. Just with the idea that “f^-1(x) is nonsensical”, and some quibbles about distinguishing between a function, its values, and the equation of its graph.


          • Scott Adamson says:


            I do not see our discussion as a quarrel and appreciate the opportunity to have a good mathematical discussion! Thanks!


  5. William Kelleher says:

    I have taught calculus at the high school and college levels. I just spent a few days trying to come up with the answer to: “why are we studying inverses?”

    It appears to me that many calculus texts just sort of throw it out there, or maybe it is listed as a miscellaneous topic. My opinion is that if I cannot answer this question when asked by students then why present it?

    My favorite way of understanding this is graphically. This is done by rotating the graph of f(x) by -pi/2 and then mirroring the result about the x axis. Has anyone seen this approach in texts?
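
    A minimal Python sketch of that construction, assuming both axes use the same scale: rotating a point by -pi/2 sends (x, y) to (y, -x), and mirroring about the x axis then gives (y, x), which is the reflection across the line y = x. The helper names and the sample curve y = x^3 + 1 are illustrative choices.

```python
def rotate_minus_quarter_turn(point):
    """Rotate a point by -pi/2 (a quarter turn clockwise) about the origin."""
    x, y = point
    return (y, -x)

def mirror_about_x_axis(point):
    """Reflect a point across the x axis."""
    x, y = point
    return (x, -y)

def reflect_across_diagonal(point):
    """Reflect a point across the line y = x by swapping its coordinates."""
    x, y = point
    return (y, x)

# Sample points on the curve y = x**3 + 1 (an arbitrary example).
curve = [(x, x**3 + 1) for x in range(-3, 4)]

two_step = [mirror_about_x_axis(rotate_minus_quarter_turn(p)) for p in curve]
one_step = [reflect_across_diagonal(p) for p in curve]

print(two_step == one_step)  # True
```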

    • Alan Cooper says:

      Is there a reason for preferring the rotation and reflection over just reflection across the diagonal?

      • William Kelleher says:

        Only that it is easier for me (maybe not others) to see a reflection around a coordinate axis.

    • Scott Adamson says:

      Hi William,

      You ask a good question… “why are we studying inverses?”

      Let me first address the second issue of rotating and reflecting…and then connect it to your first question.

      We argue that we shouldn’t rotate and reflect anything…anywhere…when working with inverse functions. I don’t want to repeat the argument in the article here, but put briefly, quantities matter!

      And if there is a question about “why are we studying inverses” in the first place, then there certainly is a question about why we are reflecting and reversing without paying attention to the meaning of the quantities.


      • William Kelleher says:

        Ok. Understood. The question then becomes why are we reflecting and rotating? Either way- whether it is to give some geometric feeling for the inverse, or simply to reverse the order of cause and effect, what are the quantities you refer to? Teaching calculus to college freshmen – in an effort to motivate how this has relevance to the real world – Planck’s real world- what is it?

        • Scott Adamson says:

          Functions allow us to make sense of relationships between quantities. For example, in the article, we use the context of tuition and fee payments.

          We think about the horizontal axis as a number line that keeps track of the quantity “total number of credits enrolled”. The vertical axis is a number line that keeps track of the quantity “total amount of money owed for tuition and fees”.

          Thus we say, T = f(c) where f(c) = 15 + 86*c where T is the total cost of tuition and fees and c is the number of credits enrolled.

          To reflect or rotate will eliminate this relationship.

          So the function takes c (number of credits) as the input and outputs the total cost.

          The inverse function, c = g(T) where g(T) = (T – 15)/86, takes the total cost of tuition and fees as the input and outputs the number of credits enrolled.
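
          As a rough Python sketch of this pair of functions (the names tuition and credits_for are hypothetical, and 12 credits is an arbitrary sample input), each function keeps its own input and output quantity:

```python
def tuition(credits):
    """T = f(c) = 15 + 86*c: total cost of tuition and fees for c credits."""
    return 15 + 86 * credits

def credits_for(total):
    """c = g(T) = (T - 15)/86: credits enrolled for a total cost of T."""
    return (total - 15) / 86

cost = tuition(12)
print(cost)               # 1047
print(credits_for(cost))  # 12.0
```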

          Here is more reading that might be interesting:

          And an activity focused on the idea of tracking quantities covariationally.


    • Alan Cooper says:

      My response to your first (and more important) question is that the study of inverse functions is really just another name for solving equations (or at least describing how the solutions behave if we don’t have explicit formulas for them). And the practical application of this is when we have some theoretical understanding of how one variable, h (say the position of some object or marker), depends on some other variable, t (say time or temperature), but want to use observations of h to determine t.

      As one who grew up without the dubious advantages of having a digital everything, my favourite examples of this are:

      1) “telling the time” where h is the angular position of the hand(s) of a clock (or length of a burning candle, or height of a pile of sand, or column of water or etc etc etc) which depends in a known way on the time, and our usual goal is to “invert” the function and determine the time t from the observation of h;
      2) “reading a thermometer” where the length of the mercury column is a known function of the temperature and we want to use the observed length to find the temperature.

      • William Kelleher says:

        Thank you. I am old enough to have used a slide rule (have 5-6 in a collection). Most texts present a linear equation which is relatively easy to solve for x or y. I am looking for a more complicated example. In other words, there has to be a good example between the extremes of a linear equation and adjoint theory!

        • Bruce Aubertin says:

          I wish I had kept my slide rules too! To answer your question, what about the exponential function exp(x)? Its inverse function is the natural log function ln(x).
          That is, y = exp(x) if and only if x=ln(y), probably the most famous and most important inverse pair of all.

        • Alan Cooper says:

          Here are some nonlinear examples from reasonably common semi-practical situations:
          Radiometric dating;
          Barometric altimeters;
          Determining the amount of a steadily decreasing annuity with given term from the amount available to buy it;
          Figuring out how high to fill a conical coffee filter with water in order to produce a specified volume of coffee.
          The angle at which to shoot a cannon with given fixed muzzle velocity (or just the height to raise that muzzle) in order to have the cannonball land at a specified distance. There are two solutions here, each of which is an example of the inverse of a function on a restricted domain.
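
          The cannon example can be sketched in Python under an idealized model that is an assumption here, not part of the comment: level ground, no air resistance, and a landing distance of v^2 * sin(2*angle) / g. The function names and the sample values (100 m/s, 500 m) are illustrative. The two angles come from inverting the distance function on two restricted domains.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def landing_distance(angle, speed):
    """Distance on level ground without air resistance: v^2 * sin(2*angle) / g."""
    return speed**2 * math.sin(2 * angle) / G

def launch_angles(distance, speed):
    """Return the low and high angles (in radians) that land at `distance`.

    Each is the inverse of landing_distance on a restricted domain:
    [0, pi/4] for the low angle and [pi/4, pi/2] for the high angle.
    """
    ratio = G * distance / speed**2
    if not 0 <= ratio <= 1:
        raise ValueError("target is out of range for this muzzle speed")
    low = 0.5 * math.asin(ratio)
    high = math.pi / 2 - low
    return low, high

low, high = launch_angles(distance=500.0, speed=100.0)
print(math.degrees(low), math.degrees(high))
print(landing_distance(low, 100.0), landing_distance(high, 100.0))
```

          Both printed distances should agree with the 500 m target up to rounding, since each angle is an inverse value on its own branch.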

          • Bill Kelleher says:

            This article addresses several issues associated with the “swap x and y approach.”
            1) The geometric interpretation of “mirror about the line y=x”- The reason I constructed a way to rotate by -pi/2 and then mirror about the x axis was in an effort to show that this is true for any x,y space curve. The purpose was to show that the inverse of y=f(x) is not the same equation, it is clearly a different space curve. However, this is only true if the scale and units are the same for the x and y axes. Of course this is implied in calculus texts where x,y are purely geometric representations and dy/dx is unitless. Therefore we could define the inverse as the space curve which results from these operations. But this seems silly!
            2) I agree that these mappings should be shown on different coordinate systems. So if we throw out the idea of same axes, different space curves, then the whole rationale of “swap x and y” (it isn't really a RATIONALE) is gone.
            3) Students (and I) will get confused if they are told to solve for x(y) with the original equation. If viewed as a space curve this is not a different curve, it is the same equation and curve. However, the inverse is not the same curve so “swap x and y” is required.

            I am confused about how to clearly teach this. It seems to me that texts are very confused about presenting inverses. To put this in educatorspeak – “What are the learning outcomes of talking about inverses?”

          • Alan Cooper says:

            In my opinion, most discussions of inverse functions fail for lack of having given proper attention to the prerequisite ideas of functions and relations – both in terms of numbers, and also non-numerical variables such as people (where, eg the birth-mother is a function of the child but not vice-versa), and also to the more basic ideas of naming and quantifying variables and of the distinction between physical quantities and their numerical values in terms of some prescribed units. And of course, at the introductory level (and maybe always!) it is probably best to avoid sloppy “abuses of notation” such as writing y=y(x) with the same symbol y used for two completely different objects!

            With those ideas in place, my suggestion re the learning objectives for talking about inverse functions would be:
            1. Be able to identify a number of practical situations (such as those I listed above) where one variable is determined as a known function of the other and we want to find what value(s) of the “independent” variable correspond to an arbitrary specified result for the “dependent” variable. And use either graphs or tables of values, or equations to solve such problems for particular values of the target “dependent” value.
            2. Understand that a graph in the plane defines *two* relations, either or both of which may or may not be functions, which are “inverse” to one another, that the graphs of xRy and yR^(-1)x are the same, and that that of yRx and xR^(-1)y is just the reflection of the former across its diagonal. [Eg. given a graph identified as any one of the four draw the graph of another – which will be either the same as the given one or reflected from it. And do the same with x and y replaced by any other symbols.]
            3. In cases where one numerical variable is given in terms of another by a formula of the form y=f(x), use standard algebraic techniques to produce (if possible) an equivalent equation of the form x=f^(-1) (y). [Eg. y=x^3+1 giving x=(y-1)^(1/3) ]
            4. Use ideas 2 and 3 to produce equations and graphs of the compositional inverses of functions given either way, and apply these ideas to generalize the solution of problems of type 1 above.
            5. For calculus students note that interchanging x and y to get the inverse function interchanges rise and run and so reciprocates the slope, but this is at a *different* value of the independent variable and leads to the identity (f^(-1))'(x)=1/f'(f^(-1)(x)). And note that this result is also a special case of implicit differentiation applied to the equation f(f^(-1)(x))=x.
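
            The reciprocal-slope identity in item 5, (f^(-1))'(x) = 1/f'(f^(-1)(x)), can be checked numerically with the example from item 3, y = x^3 + 1. This is only a sketch: the helper names, the sample input 9, and the central-difference step size are assumptions made for illustration.

```python
def f(x):
    """f(x) = x^3 + 1."""
    return x**3 + 1

def f_prime(x):
    """Derivative of f: 3x^2."""
    return 3 * x**2

def f_inv(y):
    """Inverse of f for y > 1, where the real cube root is unambiguous."""
    return (y - 1) ** (1 / 3)

def numeric_derivative(func, at, h=1e-6):
    """Central-difference estimate of the derivative of func at `at`."""
    return (func(at + h) - func(at - h)) / (2 * h)

y = 9.0  # f(2) = 9, so f_inv(9) = 2
slope_of_inverse = numeric_derivative(f_inv, y)
reciprocal_slope = 1 / f_prime(f_inv(y))

print(slope_of_inverse, reciprocal_slope)  # both close to 1/12
```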

  6. Mike Fouchet says:

    I teach Algebra 2 in a high school. I taught inverse functions in this way this year, but also mentioned to my students they will see the traditional f^-1(x) notation if they look at other resources. Many of them thought the traditional notation was nonsensical without me even going into a real-world example. When I talked them through an example of y=10x, making y dollars for x hours of work, they were even more outraged that it is traditionally taught the way it is. Even my lower ability students saw how dumb it was. Pretty awesome!

    • Scott Adamson says:

      Thanks, Mike, for sharing! It is very exciting to hear of students making sense of mathematics! Well done!


  7. JC says:

    Why not redefine the variables for f^-1(x)? It seems to me that students would then understand the relationship between the inverse and function in context and also graphically.

    • Jorge Gonzalez says:

      Exactly what I was thinking. Just explain to students that as x and y are swapped, so are their meanings, and avoid graphing the inverse function on the same coordinate axes for the same reason. If you don’t interchange the variables, explain to students that y is now the independent variable (horizontal axis), which might confuse them a little as well.

  8. PN says:

    When I teach anything involving variables, I tend to avoid x and y at first in favor of geometric shape symbols like stars or circles or squares or, even, emojis. The point is to understand the concept of variable independently from name. [Mathematica really does a good job of this, emphasizing functions as pattern matching of the left side of a replacement rule and then substitution following the right hand side of the rule. The rule can contain “slots” as independent variables and the sole purpose of naming those slots on the left hand side is to be able to distinguish between them on the right hand side.] I don’t, however, bring up Mathematica. After a while, students get tired of drawing emojis or stars and start on their own to use X and Y for convenience of quick writing, but sometimes revert back to “star” or “circle” when they get confused.

  9. William O'Brien says:

    I really like how you snuck in x-addiction.

  10. Anthony Bachelder says:

    First, I want to discuss the Larry Bird problem. You need to take a step back and understand the real-world meaning of an average. For Larry Bird it means on any given night he is expected to score about 24 points. Now we know that he will not score 24 points every night. Some nights he may score more than 24 points, and on other nights he may score less than 24 points. In summary, he is expected to score 24.3 points.
    Second, why you shouldn’t switch x and y. I have to disagree with the authors. The concept of switching the x and y carries over to graphing from a table of data. To graph the inverse of a table of data, switch the x and y (f(x)) columns. The graphs then give an excellent example of the concept of inverses.
