9 Tricks to Make Python Faster

2024-01-10 15:46:13

This article introduces 9 small tricks for making Python code faster.


Small changes, big differences

1. Faster String Concatenation: Choose “join()” or “+” Skillfully

2. Faster List Creation: Use “[]” Over “list()”

3. Faster Membership Testing: Use a Set Over a List

4. Faster Data Generation: Use Comprehensions Over For Loops

5. Faster Loops: Prioritize Local Variables

6. Faster Execution: Prioritize Built-In Modules and Libraries

7. Faster Function Calls: Leverage Cache Decorator for Easy Memoization

8. Faster Infinite Loop: Prefer “while 1” Over “while True”

9. Faster Start-Up: Import Python Modules Smartly

“Python is too slow.”

This sentiment echoes frequently in discussions about programming languages, often overshadowing Python’s numerous strengths.

The truth is, Python is fast if you can write it in a Pythonic way.

The devil is in the details. Experienced Python developers are armed with an arsenal of subtle yet powerful tricks to significantly enhance their code’s performance.

These tricks might seem minor at first glance, but they can lead to substantial improvements in efficiency. Let’s delve into 9 of these approaches, transforming the way you write and optimize Python code.

1. Faster String Concatenation: Choose “join()” or “+” Skillfully

String concatenation will become a bottleneck of your Python program if a large number of strings are waiting to be handled.

Basically, there are two ways of string concatenation in Python:

  • Use the join() function to combine a list of strings into one
  • Use the + or += operator to append the strings one at a time

So which way is faster?

Talk is cheap, let’s define 3 different functions for concatenating the same strings:

mylist = ["Yang", "Zhou", "is", "writing"]

# Using '+'
def concat_plus():
    result = ""
    for word in mylist:
        result += word + " "
    return result

# Using 'join()'
def concat_join():
    return " ".join(mylist)

# Direct concatenation of string literals, without the list
def concat_directly():
    return "Yang" + "Zhou" + "is" + "writing"

Based on your first impression, which function do you think is the fastest, and which is the slowest?

The real result may surprise you:

import timeit

print(timeit.timeit(concat_plus, number=10000))
# 0.002738415962085128
print(timeit.timeit(concat_join, number=10000))
# 0.0008482920238748193
print(timeit.timeit(concat_directly, number=10000))
# 0.00021425005979835987

As shown above, for concatenating a list of strings, the join() method is faster than adding the strings one by one in a for loop.

The reason is straightforward. On one hand, strings are immutable in Python, so each += operation can create a new string and copy the old one into it, which is computationally expensive.

On the other hand, the join() method is specifically optimized for joining a sequence of strings. It precalculates the size of the resulting string and then builds it in one go, so it avoids the overhead of repeated += operations in a loop and is therefore faster.
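As a minimal sketch of that advice, a loop that builds text piece by piece can collect the pieces and join them once at the end. The function names and sample data below are illustrative and not part of the benchmark above:

```python
def build_csv_line_plus(values):
    # Repeated += may copy the growing string on every iteration.
    line = ""
    for index, value in enumerate(values):
        if index:
            line += ","
        line += str(value)
    return line


def build_csv_line_join(values):
    # Hand all the pieces to join() and let it build the result in one pass.
    return ",".join(str(value) for value in values)


sample = [1, 2.5, "three"]
print(build_csv_line_plus(sample))
print(build_csv_line_join(sample))
```

Both functions return the same text; only the way the result is assembled differs.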

However, the fastest function in our test is the one that concatenates string literals directly. Its speed comes from two things:

  • The Python interpreter can optimize the concatenation of string literals at compile time, turning them into a single string literal. There are no loop iterations or function calls involved, making it a very efficient operation.
  • Since all the strings are known at compile time, Python can perform this operation very quickly, much faster than runtime concatenation in a loop or even the optimized join() method.
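One way to check the compile-time folding described above is to look at the constants stored in the function's code object. This is a sketch that assumes a CPython build that performs constant folding; other implementations may behave differently:

```python
import dis


def concat_directly():
    return "Yang" + "Zhou" + "is" + "writing"


# If the compiler folded the literals, the joined string is stored
# as a single constant of the function's code object.
print(concat_directly.__code__.co_consts)

# The disassembly should then contain no addition at run time,
# only a load of the pre-built constant.
dis.dis(concat_directly)
```

Note that this only applies to literals. As soon as one operand is a variable, the concatenation happens at run time.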

In a word, if you need to concatenate a list of strings, choose join() over +=. If you would like to concatenate strings directly, just use + to do it.

2. Faster List Creation: Use “[]” Over “list()”

Creating a list is not a big deal. Two common ways are:

  1. Use the list() function
  2. Use [] directly

Let’s use a simple code snippet to test their performance:

import timeit

print(timeit.timeit('[]', number=10 ** 7))
# 0.1368238340364769
print(timeit.timeit(list, number=10 ** 7))
# 0.2958830420393497

As the result shows, executing the list() function is slower than using [] directly.

This is because [] is literal syntax, while list() is a constructor call. The call requires a name lookup and a function call, both of which take extra time.

By the same logic, when creating a dictionary, we should also prefer {} over dict().
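The difference between literal syntax and a constructor call can be made visible with the standard dis module. The helper name opnames below is made up for this sketch, and the exact instruction names vary between CPython versions:

```python
import dis


def opnames(source):
    """Return the bytecode instruction names CPython generates for an expression."""
    code = compile(source, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")

print(literal_ops)  # contains a dedicated BUILD_LIST instruction
print(call_ops)     # contains a name lookup followed by a call instruction
```

The literal is handled by one dedicated instruction, while list() first has to find the name list and then call it.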

3. Faster Membership Testing: Use a Set Over a List

The performance of a membership checking operation heavily depends on the underlying data structures:

import timeit

large_dataset = range(100000)
search_element = 2077

large_list = list(large_dataset)
large_set = set(large_dataset)

def list_membership_test():
    return search_element in large_list

def set_membership_test():
    return search_element in large_set

print(timeit.timeit(list_membership_test, number=1000))
# 0.01112208398990333
print(timeit.timeit(set_membership_test, number=1000))
# 3.27499583363533e-05

As the above code demonstrates, membership testing in a set is much faster than in a list.

Why is that?

  • In Python lists, membership testing (element in list) is done by iterating over each element until the desired element is found or the end of the list is reached. Therefore, this operation has a time complexity of O(n).
  • Sets in Python are implemented as hash tables. When checking for membership (element in set), Python uses a hashing mechanism, whose time complexity, on average, is O(1).
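In practice this usually means converting a list to a set once, before a run of repeated lookups. The following is a small sketch with made-up function names and data:

```python
def find_known_users_list(candidates, known_users):
    # Each membership test scans the list: O(len(known_users)) per candidate.
    return [name for name in candidates if name in known_users]


def find_known_users_set(candidates, known_users):
    # Build the hash table once; each membership test is then O(1) on average.
    known = set(known_users)
    return [name for name in candidates if name in known]


known_users = ["ada", "grace", "linus"]
candidates = ["grace", "guido", "ada"]
print(find_known_users_list(candidates, known_users))
print(find_known_users_set(candidates, known_users))
```

Building the set is itself an O(n) step and its elements must be hashable, so the conversion pays off when the same collection is searched many times rather than once.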

The point here is to carefully consider the underlying data structure when writing programs. Harnessing the right data structure can speed up our code significantly.

4. Faster Data Generation: Use Comprehensions Over For Loops

There are four types of comprehensions in Python: list, dictionary, set, and generator. They not only provide a more concise syntax for creating the corresponding data structures, but also tend to perform better than equivalent for loops, because CPython compiles them to specialized bytecode that avoids, for example, looking up and calling append on every iteration.

import timeit

def generate_squares_for_loop():
    squares = []
    for i in range(1000):
        squares.append(i * i)
    return squares

def generate_squares_comprehension():
    return [i * i for i in range(1000)]

print(timeit.timeit(generate_squares_for_loop, number=10000))
# 0.2797503340989351
print(timeit.timeit(generate_squares_comprehension, number=10000))
# 0.2364629579242319

The above code is a simple speed comparison between a list comprehension and a for loop. As the result shows, the list comprehension is faster.
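For reference, here is what the four kinds of comprehension look like side by side. The variable names are illustrative only:

```python
numbers = [1, 2, 3, 4]

squares_list = [n * n for n in numbers]     # list comprehension
squares_dict = {n: n * n for n in numbers}  # dictionary comprehension
parities_set = {n % 2 for n in numbers}     # set comprehension
squares_gen = (n * n for n in numbers)      # generator expression (lazy)

# The generator produces its values one at a time while sum() consumes them.
total = sum(squares_gen)

print(squares_list)
print(squares_dict)
print(parities_set)
print(total)
```

The generator form builds no intermediate list, which makes it a reasonable choice when the values are consumed only once.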

5. Faster Loops: Prioritize Local Variables

In Python, accessing a local variable is faster than accessing a global one or an attribute of an object.

Here is an instance to prove this:

import timeit

class Example:
    def __init__(self):
        self.value = 0

obj = Example()

def test_dot_notation():
    for _ in range(1000):
        obj.value += 1

def test_local_variable():
    value = obj.value
    for _ in range(1000):
        value += 1
    obj.value = value

print(timeit.timeit(test_dot_notation, number=1000))
# 0.036605041939765215
print(timeit.timeit(test_local_variable, number=1000))
# 0.024470250005833805

This is how Python works. When a function is compiled, CPython knows its local variables and stores them in a fixed-size array accessed by index, whereas global names and object attributes have to be looked up at run time, typically in a dictionary.

The gain is minor, but we can leverage it to optimize hot loops when handling large amounts of data.
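The same idea applies to functions and methods used inside a hot loop: they can be bound to local names once, before the loop starts. This is a sketch with made-up function names; the gain tends to be small, and recent CPython releases may narrow it further:

```python
import math


def norms_global_lookup(points):
    result = []
    for x, y in points:
        # math.sqrt and result.append are resolved again on every iteration.
        result.append(math.sqrt(x * x + y * y))
    return result


def norms_local_alias(points):
    result = []
    sqrt = math.sqrt        # one global and attribute lookup, done up front
    append = result.append  # bound method cached in a local variable
    for x, y in points:
        append(sqrt(x * x + y * y))
    return result


points = [(3, 4), (5, 12), (8, 15)]
print(norms_global_lookup(points))
print(norms_local_alias(points))
```

Because the aliases make the code a little harder to read, this is usually worth doing only in loops that profiling has shown to be hot.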

6. Faster Execution: Prioritize Built-In Modules and Libraries

When engineers say Python, they usually mean CPython, because CPython is the default and most widely used implementation of the Python language.

Given that many of its built-in functions and standard-library modules are written in C, a faster and lower-level language, we should use this built-in arsenal and avoid reinventing the wheel.

import timeit
import random
from collections import Counter

def count_frequency_custom(lst):
    frequency = {}
    for item in lst:
        if item in frequency:
            frequency[item] += 1
        else:
            # First occurrence of this item
            frequency[item] = 1
    return frequency

def count_frequency_builtin(lst):
    return Counter(lst)

large_list = [random.randint(0, 100) for _ in range(1000)]

print(timeit.timeit(lambda: count_frequency_custom(large_list), number=100))
# 0.005160166998393834
print(timeit.timeit(lambda: count_frequency_builtin(large_list), number=100))
# 0.002444291952997446

The above program compares two approaches to count element frequency in a list. As we can see, leveraging the built-in Counter from the collections module is faster, neater, and better than writing a for loop by ourselves.
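Besides speed, Counter also comes with conveniences that a hand-written dictionary loop would have to reimplement. Here is a short sketch with made-up sample data:

```python
from collections import Counter

words = ["spam", "egg", "spam", "ham", "egg", "spam"]
counts = Counter(words)

print(counts["spam"])         # count for one element
print(counts["missing"])      # absent keys count as 0 instead of raising KeyError
print(counts.most_common(2))  # the two most frequent elements with their counts
```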

7. Faster Function Calls: Leverage Cache Decorator for Easy Memoization

Caching is a commonly used technique to avoid repeated computations and speed up programs.

Fortunately, we don’t need to write our own caching code in most cases, since Python provides an out-of-the-box decorator for this purpose: @functools.cache (available since Python 3.9).

For instance, the following code will execute two Fibonacci number generation functions, one has a caching decorator but the other doesn’t:

import timeit
import functools

def fibonacci(n):
    if n in (0, 1):
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

@functools.cache  # memoize results by argument
def fibonacci_cached(n):
    if n in (0, 1):
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)

# Test the execution time of each function
print(timeit.timeit(lambda: fibonacci(30), number=1))
# 0.09499712497927248
print(timeit.timeit(lambda: fibonacci_cached(30), number=1))
# 6.458023563027382e-06

The result proves how the functools.cache decorator makes our code faster.

The basic fibonacci function is inefficient because it recomputes the same Fibonacci numbers multiple times during the process of getting the result of fibonacci(30).

The cached version is significantly faster since it caches the results of previous calculations. So, it only computes every Fibonacci number once, and subsequent calls with the same arguments are retrieved from the cache.
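The claim that every Fibonacci number is computed only once can be checked with the cache_info() method that functools.cache adds to the decorated function. This is a minimal sketch that starts from an empty cache:

```python
import functools


@functools.cache  # Python 3.9+; functools.lru_cache(maxsize=None) on older versions
def fibonacci_cached(n):
    if n in (0, 1):
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


result = fibonacci_cached(30)
info = fibonacci_cached.cache_info()

print(result)
print(info.misses)  # calls that actually ran the function body
print(info.hits)    # calls answered from the cache
```

Keep in mind that this cache is unbounded and that the arguments must be hashable. For long-running programs, functools.lru_cache with a maxsize, or an occasional fibonacci_cached.cache_clear(), keeps memory use in check.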

Merely adding a built-in decorator can make such a big improvement. This is what Pythonic means. 😎

8. Faster Infinite Loop: Prefer “while 1” Over “while True”

To make an infinite while loop, we can use while True or while 1.

The difference in their performance is usually negligible. But it’s fun to know that while 1 has a reputation for being slightly faster.

That reputation stems from Python 2, where 1 is a literal but True was a built-in name that had to be looked up at run time, so a minuscule overhead was needed. In Python 3, True is a keyword, and CPython compiles both loops to the same bytecode.

Let’s also check the real comparison of these two ways in a code snippet:

import timeit

def loop_with_true():
    i = 0
    while True:
        if i >= 1000:
            break  # leave the loop after 1000 iterations
        i += 1

def loop_with_one():
    i = 0
    while 1:
        if i >= 1000:
            break  # leave the loop after 1000 iterations
        i += 1

print(timeit.timeit(loop_with_true, number=10000))
# 0.1733035419601947
print(timeit.timeit(loop_with_one, number=10000))
# 0.16412191605195403

As we can see, in this run the while 1 version finished slightly sooner.

However, modern Python interpreters (like CPython) are highly optimized, and a gap this small is typically insignificant and hard to tell apart from timing noise. So we don’t need to worry about it, not to mention that while True is more readable than while 1.
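To see how small the difference really is, the compiled bytecode of the two loops can be compared with the dis module. This is a sketch that assumes CPython 3; the helper name instructions is made up for the example:

```python
import dis


def loop_with_true():
    i = 0
    while True:
        if i >= 1000:
            break
        i += 1
    return i


def loop_with_one():
    i = 0
    while 1:
        if i >= 1000:
            break
        i += 1
    return i


def instructions(func):
    """List (instruction name, argument text) pairs for a function's bytecode."""
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]


same_bytecode = instructions(loop_with_true) == instructions(loop_with_one)
print(same_bytecode)
```

When the two instruction lists match, any difference seen with timeit comes from measurement noise rather than from the loop condition.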

9. Faster Start-Up: Import Python Modules Smartly

It seems natural to import all modules at the top of a Python script.

Actually, we don’t have to do that.

Furthermore, if a module is too large, importing it as needed is a better idea.

def my_function():
    import heavy_module
    # rest of the function

As in the code above, heavy_module is imported inside a function. This is an idea of “lazy loading”, where the import is deferred until my_function is called.

The benefit of this approach is that if my_function is never called during the execution of our script, then heavy_module is never loaded, saving resources and reducing the startup time of our script.
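Here is a runnable sketch of the same pattern. Since heavy_module is only a placeholder name, the standard csv and io modules stand in for it, and the function name export_report is made up for the example:

```python
import sys


def export_report(rows):
    # Deferred imports: these statements run only when the function is called.
    # After the first call the modules are cached in sys.modules, so
    # repeating the import on later calls is cheap.
    import csv
    import io

    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


report = export_report([["a", 1], ["b", 2]])
print(repr(report))
print("csv" in sys.modules)  # loaded by now, because the function has run
```

To find out which imports are actually worth deferring, CPython can print per-module import times when a script is started with python -X importtime.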
