Why Python's is isn't equals

Sun 01 March 2020

It's pretty easy to mix up is and == when learning Python. The difference seems subtle at first and in many cases the two operators seem to be doing the exact same thing.

In [1]:

one = 1

if one == 1:
    print("one == 1")

if one is 1:
    print("one is 1")

one == 1
one is 1

The difference between the two, however, is important, and misusing is can lead to perplexing bugs. If you don't read any further, the most important things to take away are:

Never use is to compare a variable to a literal value; always use ==
Checking whether a variable is None is the main case where is is the correct choice

The fact that one is 1 works above should be thought of as a quirk of Python's implementation; the code is still incorrect.

Python's is operator means "these two names refer to the same object in memory", while == means "these two objects hold equal values". This matters because if a is b, then mutating the object through a is also visible through b, while no such expectation should be made with ==. For well-behaved types, a is b implies a == b, but the reverse is not true.

An example to illustrate:

In [2]:

list_1 = [1, 2]
list_2 = list_1

if list_1 == list_2:
    print("list_1 == list_2")

if list_1 is list_2:
    print("list_1 is list_2")

list_1 == list_2
list_1 is list_2

In [3]:

list_1.append(3)

print(f"list_1: {list_1}")
print(f"list_2: {list_2}")

list_1: [1, 2, 3]
list_2: [1, 2, 3]

Note that updating list_1 also updates list_2, because both names refer to the same list. We can also see that they have the same id:

In [4]:

print(f"id(list_1): {id(list_1)}")
print(f"id(list_2): {id(list_2)}")

id(list_1): 4562732848
id(list_2): 4562732848
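For contrast, here is a minimal sketch of the opposite case: two lists built separately hold equal values but are distinct objects. The names list_3 and list_4 are made up for this illustration.

```python
# Two lists created independently: same contents, different objects.
list_3 = [1, 2]
list_4 = [1, 2]

print(f"list_3 == list_4: {list_3 == list_4}")
print(f"list_3 is list_4: {list_3 is list_4}")

# Mutating one leaves the other untouched, since they are separate objects.
list_3.append(3)
print(f"list_3: {list_3}")
print(f"list_4: {list_4}")
```

Before the append, == reports True while is reports False, and the append changes only list_3.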

Why this is important

Given what we've looked at so far, it seems strange that one is 1 above returned True, because it implies that the variable one refers to the very same object as the literal 1. Indeed, this is exactly what happens: it's an optimization done by CPython, which keeps a single shared object for each boolean, caches small ints (-5 through 256 in current CPython), and reuses some strings. For strings, the optimization is called string interning.

This optimization, however, can lead to confusing scenarios like the following:

In [5]:

one = 1
one is 1

Out[5]:

True

In [6]:

one_thousand = 1000
one_thousand is 1000

Out[6]:

False

In [7]:

short_string = "short"
short_string is "short"

Out[7]:

True

In [8]:

not_so_short_string = "not so short string"
not_so_short_string is "not so short string"

Out[8]:

False

Note the difference in behavior depending on the value being stored. Even more confusing, however, is that this is implementation-specific, not something the language guarantees. Switching between 32 and 64-bit CPython, or to a different implementation altogether, can yield different behaviors.
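The sketch below is one way to poke at this without writing is next to a literal. It builds the values at runtime, so the compiler cannot merge equal constants, and then compares identity. The int results are deliberately left unstated because they depend on the interpreter. The only identity result that can be relied on is the last line, since sys.intern is documented to return one shared copy for equal strings. All variable names here are made up for the example.

```python
import sys

# Build the ints at runtime instead of writing literals.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

# Equality always holds; identity depends on the interpreter's caching.
print(f"small: == {small_a == small_b}, is {small_a is small_b}")
print(f"large: == {large_a == large_b}, is {large_a is large_b}")

# Strings assembled at runtime are equal, but may or may not be shared.
text_a = "".join(["not so ", "short string"])
text_b = "".join(["not so ", "short string"])
print(f"text:  == {text_a == text_b}, is {text_a is text_b}")

# Explicit interning returns the same object for equal strings.
print(f"interned is: {sys.intern(text_a) is sys.intern(text_b)}")
```

Whatever the first three is results turn out to be on a given install, the == results stay the same, which is why == is the one to use for values.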

I once spent several hours trying to figure out why a script that worked on one Windows PC did not work at all on another. The script was sending data over a serial port so I was quite sure that there was either a hardware issue or a misconfigured setting somewhere. It turned out that the working PC had 64-bit Python installed while the troublesome PC had a 32-bit install and a string comparison (quite far from where I was searching for the issue) using is was behaving differently.

Parting thoughts

Most of the conda environments that I have floating around at the moment are Python 3.7. While writing this, I created a new conda environment for testing and learned that Python 3.8 now emits a SyntaxWarning on erroneous usage of is. This is an excellent idea and I'm glad that it made its way into CPython. Pylint also warns about this, which is yet another reason to use static analysis tools.
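A small sketch of how to see that warning without running the offending code, assuming Python 3.8 or newer: compile a snippet and record whatever warnings the compiler raises. The snippet text and the "<is-demo>" filename are placeholders, and the exact wording of the message varies between Python versions.

```python
import warnings

# Hypothetical snippet that misuses "is" with an int literal.
source = "one is 1"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    compile(source, "<is-demo>", "eval")  # compile only; the code is never run

syntax_warnings = [w for w in caught if issubclass(w.category, SyntaxWarning)]
for w in syntax_warnings:
    print(f"{w.category.__name__}: {w.message}")
```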

I didn't really address the fact that is None is the correct way to do comparisons to None. There is a good reason for it and it's explained in this Real Python post under Taking a Look Under the Hood. The short explanation is that None is a singleton in Python, so every reference to None points to the same object. Using == None generally also works, although for reasons also outlined in the Real Python post, it is not the same and has a small potential to produce very confusing bugs. This distinction is important but understandably confusing at first.
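Here is a minimal sketch of one way == None can mislead, using a made-up AlwaysEqual class. It is an illustration of the general risk (== calls a method that any class can override) and not necessarily the exact case the Real Python post describes.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# == dispatches to AlwaysEqual.__eq__, so this reports True.
print(f"value == None: {value == None}")

# is checks object identity and cannot be overridden, so this reports False.
print(f"value is None: {value is None}")
```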

There are a few other caveats to things that I've written here. For example, it is possible for an is comparison to evaluate to True while == evaluates to False by doing something like:

In [9]:

class NeverEquals:

    def __eq__(self, other):
        return False

a = NeverEquals()
b = a

print(f"a is b: {a is b}")
print(f"a == b: {a == b}")

a is b: True
a == b: False

Although this would be quite a strange scenario.
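A less contrived case of the same effect is the floating-point NaN value, which compares unequal to everything, including itself. A short sketch:

```python
nan = float("nan")

# Same object on both sides, so identity holds...
print(f"nan is nan: {nan is nan}")

# ...but NaN is defined to be unequal to itself.
print(f"nan == nan: {nan == nan}")
```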
