
Commit e56cd36

Browse files
committed
update process and thread tutorials
1 parent 4ffcd35 commit e56cd36

File tree

4 files changed

+248
-0
lines changed


13_thread/4.threadlocal.py

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
########################################ThreadLocal###################################

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import threading

# Create a global ThreadLocal object:
local_school = threading.local()

def process_student():
    # Get the student associated with the current thread:
    std = local_school.student
    print('Hello, %s (in %s)' % (std, threading.current_thread().name))

def process_thread(name):
    # Bind the current thread's student to the ThreadLocal object:
    print('main thread...')
    local_school.student = name
    process_student()

t1 = threading.Thread(target=process_thread, args=('Alice',), name='Thread-A')
t2 = threading.Thread(target=process_thread, args=('Bob',), name='Thread-B')
t1.start()
t2.start()
t1.join()
t2.join()

# The output is as follows:
#
# main thread...
# Hello, Alice (in Thread-A)
# main thread...
# Hello, Bob (in Thread-B)

# The global variable local_school is a ThreadLocal object. Each thread can read
# and write the student attribute on it without affecting other threads. You can
# think of local_school as a global variable, while each attribute such as
# local_school.student is a thread-local variable: it can be read and written
# freely without interference between threads, and without any locking, because
# ThreadLocal handles that internally.

# You can also think of local_school as a global dict: besides
# local_school.student, you can bind other variables to it, such as
# local_school.teacher, and so on.

# The most common use of ThreadLocal is binding a database connection, HTTP
# request, user identity, and the like to each thread, so that every function
# the thread calls can conveniently access these resources.

############################################ Summary #############################
# Although a ThreadLocal variable is a global variable, each thread reads and
# writes its own independent copy without interfering with other threads.
# ThreadLocal solves the problem of passing parameters between the functions of
# a single thread.
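To make the "bind a database connection per thread" use case concrete, here is a minimal sketch. The names (`local_ctx`, `Connection`, `handle_request`) and the fake connection class are illustrative assumptions, not part of the tutorial code above:

```python
import threading

# Hypothetical per-thread resource holder, mirroring local_school above:
local_ctx = threading.local()

class Connection:
    """A stand-in for a real database connection (illustrative only)."""
    def __init__(self, name):
        self.name = name
    def query(self, sql):
        return '%s handled by %s' % (sql, self.name)

def handle_request(sql):
    # Any function running in this thread can reach the connection without
    # it being passed as an argument:
    return local_ctx.conn.query(sql)

results = {}

def worker(name):
    local_ctx.conn = Connection(name)   # bound once per thread
    results[name] = handle_request('SELECT 1')

threads = [threading.Thread(target=worker, args=('conn-%d' % i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
# Each thread saw only the connection it bound itself, e.g.
# {'conn-0': 'SELECT 1 handled by conn-0', ...}
```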

13_thread/5.prothread.py

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
# ######################################## Process vs Thread ################################

# We have introduced multiprocessing and multithreading, the two most common ways
# to implement multitasking. Now let's discuss the pros and cons of each approach.

# First of all, to implement multitasking we usually use the Master-Worker
# pattern: the Master assigns tasks and the Workers execute them. In a
# multitasking environment there is therefore usually one Master and multiple
# Workers.

# If Master-Worker is implemented with multiple processes, the main process is
# the Master and the other processes are the Workers.

# If Master-Worker is implemented with multiple threads, the main thread is the
# Master and the other threads are the Workers.

# The biggest advantage of the multi-process model is stability: if a child
# process crashes, it does not affect the main process or the other child
# processes. (Of course, if the main process dies, all processes die, but the
# Master process only assigns tasks, so its probability of crashing is low.)
# The famous Apache server was the first to adopt the multi-process model.

# The disadvantage of the multi-process model is that creating a process is
# expensive. On Unix/Linux a fork() call makes it tolerable, but creating
# processes on Windows is costly. In addition, the number of processes the
# operating system can run at the same time is limited: under memory and CPU
# constraints, with thousands of processes running at once the operating system
# will even run into scheduling problems.

# The multi-threaded model is usually slightly faster than the multi-process
# model, but not by much, and its fatal flaw is that any thread crashing may
# bring down the entire process, because all threads share the process's memory.
# On Windows, when the code executed by a thread has a problem you often see the
# prompt: "The program has performed an illegal operation and is about to close."
# Often only one thread had a problem, but the operating system forcibly
# terminates the entire process.

# On Windows, multithreading is more efficient than multiprocessing, so
# Microsoft's IIS server uses the multi-threaded model by default. Because of
# the stability problems of multithreading, IIS is less stable than Apache.
# To mitigate this, both IIS and Apache now offer a mixed multi-process +
# multi-threaded mode, which really complicates matters.
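The Master-Worker pattern described above can be sketched in a few lines. This is a thread-based version (a multi-process version would substitute multiprocessing, as the master/worker scripts later in this commit do); the task of squaring a number and the sentinel protocol are illustrative assumptions:

```python
import queue
import threading

# The main thread is the Master: it puts tasks on a queue.
# Worker threads take tasks off and push results back.
tasks = queue.Queue()
results = queue.Queue()

def worker():
    while True:
        n = tasks.get()
        if n is None:          # sentinel: no more work for this Worker
            break
        results.put(n * n)     # the "task" here is just squaring a number

workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()

for n in range(10):            # Master assigns tasks
    tasks.put(n)
for _ in workers:              # one sentinel per Worker to shut it down
    tasks.put(None)
for w in workers:
    w.join()

squares = sorted(results.get() for _ in range(10))
print(squares)
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```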
# ######################################## Thread Switching ################################

# Whether you use multiple processes or multiple threads, once their number is
# large enough, efficiency stops improving. Why?

# Let's use an analogy. Suppose you are, unfortunately, preparing for the senior
# high school entrance examination, and every night you must do homework in 5
# subjects: Chinese, mathematics, English, physics, and chemistry. Each
# assignment takes 1 hour.

# If you first spend 1 hour on the Chinese homework, then 1 hour on the math
# homework, and so on through each subject in turn, the total is 5 hours. This
# approach is called the single-task model, or batch task model.

# Now suppose you switch to a multitasking model: do Chinese for 1 minute, switch
# to math for 1 minute, then switch to English, and so on. As long as the
# switching is fast enough, this is exactly how a single-core CPU executes
# multiple tasks. From an outside observer's point of view, you appear to be
# doing all 5 assignments at the same time.

# However, switching assignments has a cost. When switching from Chinese to
# math, you first have to clear the Chinese books and pens off the desk (this is
# called saving the context), then open the math textbook and find your compass
# and ruler (this is called preparing the new environment) before you can start
# on the math homework. The operating system does the same when switching
# processes or threads: it saves the current execution context (CPU register
# state, memory pages, etc.), then prepares the new task's execution environment
# (restores the last register state, switches memory pages, etc.) before
# execution can resume. Although each switch is fast, it still takes time. With
# thousands of tasks running at once, the operating system may spend most of its
# time switching tasks, leaving little time to actually run them.

# Therefore, once multitasking passes a certain limit, it consumes all of the
# system's resources, efficiency drops sharply, and no task gets done well.
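The switching cost can be glimpsed with a rough sketch (not a rigorous benchmark; the numbers vary by machine, and the workload and thread count are arbitrary assumptions): the same total amount of CPU-bound work is done once in a single "batch" task, then chopped across 100 threads that the interpreter and OS must keep switching between.

```python
import threading
import time

COUNT = 100_000

def spin(n):
    # A trivial CPU-bound task: sum the first n integers in a loop.
    s = 0
    for i in range(n):
        s += i
    return s

# Single batch task:
t0 = time.perf_counter()
spin(COUNT)
single = time.perf_counter() - t0

# The same work split across 100 threads; thread creation plus switching add
# overhead without adding parallelism for CPU-bound work:
t0 = time.perf_counter()
threads = [threading.Thread(target=spin, args=(COUNT // 100,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
many = time.perf_counter() - t0

print('1 thread: %.4fs, 100 threads: %.4fs' % (single, many))
```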
# ################################## Compute-intensive vs. IO-intensive #######################

# A second consideration for multitasking is the type of task. We can divide
# tasks into compute-intensive and IO-intensive.

# Compute-intensive tasks are characterized by heavy computation that consumes
# CPU resources, such as computing pi or decoding high-definition video;
# everything depends on the CPU's computing power. Such tasks can be done with
# multitasking, but the more tasks there are, the more time is spent on task
# switching and the lower the CPU's efficiency. To use the CPU efficiently, the
# number of simultaneous compute-intensive tasks should equal the number of CPU
# cores.

# Since compute-intensive tasks mainly consume CPU, the efficiency of the code
# matters greatly. Scripting languages like Python run inefficiently and are
# quite unsuitable for compute-intensive tasks; such tasks are best written in C.

# The second type of task is IO-intensive: tasks involving network or disk IO.
# These tasks consume little CPU and spend most of their time waiting for IO
# operations to complete (because IO is far slower than the CPU and memory).
# For IO-intensive tasks, more tasks mean higher CPU utilization, up to a limit.
# Most everyday tasks, such as web applications, are IO-intensive.

# During an IO-intensive task, 99% of the time goes to IO and very little to the
# CPU, so replacing a very slow scripting language like Python with very fast C
# barely improves throughput. For IO-intensive tasks, the most suitable language
# is the one with the highest development efficiency (the least code): a
# scripting language is the first choice, and C is the worst.
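A short sketch of why more tasks help IO-bound work: while one thread waits for IO, others proceed. Here `time.sleep` stands in for a network or disk operation (an illustrative assumption), and the worker count of 10 is arbitrary:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(i):
    time.sleep(0.2)   # "waiting for IO" -- the CPU is idle here
    return i

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as ex:
    # Ten 0.2s waits overlap instead of running back to back:
    results = list(ex.map(fake_io, range(10)))
elapsed = time.perf_counter() - t0

# Total time is close to 0.2s rather than the sequential 2s:
print('got %d results in %.2fs' % (len(results), elapsed))
```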
# ################################## Asynchronous IO #######################
# Given the huge speed gap between the CPU and IO, a task spends most of its
# execution time waiting for IO operations. A single-process, single-threaded
# model would prevent other tasks from running concurrently, so we need a
# multi-process or multi-threaded model to support concurrent execution of
# multiple tasks.

# Modern operating systems have greatly improved IO operations, the biggest
# feature being support for asynchronous IO. By fully exploiting the operating
# system's asynchronous IO support, a single-process, single-threaded model can
# perform multitasking. This new model is called the event-driven model. Nginx
# is a web server that supports asynchronous IO: on a single-core CPU, its
# single-process model can efficiently support multitasking; on a multi-core
# CPU, it can run multiple processes (as many as there are CPU cores) to take
# full advantage of the cores. Because the total number of processes in the
# system stays very small, operating-system scheduling remains very efficient.
# Multitasking with the asynchronous IO programming model is a major trend.

# In Python, the single-threaded asynchronous programming model is called
# coroutines. With coroutine support, efficient multitasking programs can be
# written on top of event-driven execution. We'll discuss how to write
# coroutines later.
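As a small preview of the coroutines discussed later, here is a minimal asyncio sketch of the event-driven model: one process, one thread, yet two "IO waits" overlap because the event loop switches between coroutines at each await. The names `fetch` and the 0.1s delays are illustrative assumptions:

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)   # stands in for an asynchronous IO operation
    return '%s done' % name

async def main():
    # Both coroutines run concurrently on a single thread; gather preserves
    # argument order in its result list:
    return await asyncio.gather(fetch('a', 0.1), fetch('b', 0.1))

results = asyncio.run(main())
print(results)
# ['a done', 'b done'] -- total time ~0.1s, not 0.2s
```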

13_thread/6.master.py

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
######################################## Master ###################################
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import random, time, queue
from multiprocessing.managers import BaseManager

# Queue for sending tasks:
task_queue = queue.Queue()
# Queue for receiving results:
result_queue = queue.Queue()

# QueueManager inherits from BaseManager:
class QueueManager(BaseManager):
    pass

# Register both Queues on the network; the callable parameter associates each
# registered name with a Queue object:
QueueManager.register('get_task_queue', callable=lambda: task_queue)
QueueManager.register('get_result_queue', callable=lambda: result_queue)
# Bind port 5000 and set the authkey 'abc':
manager = QueueManager(address=('', 5000), authkey=b'abc')
# Start the manager:
manager.start()
# Get the Queue objects as exposed over the network:
task = manager.get_task_queue()
result = manager.get_result_queue()
# Put a few tasks in:
for i in range(10):
    n = random.randint(0, 10000)
    print('Put task %d...' % n)
    task.put(n)
# Read the results from the result queue:
print('Try get results...')
for i in range(10):
    r = result.get(timeout=10)
    print('Result: %s' % r)
# Close:
manager.shutdown()
print('master exit.')

13_thread/7.worker.py

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
######################################## Worker ###################################
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import time, sys, queue
from multiprocessing.managers import BaseManager

# Create a similar QueueManager:
class QueueManager(BaseManager):
    pass

# Since this QueueManager only gets the Queues from the network, only the names
# are provided when registering:
QueueManager.register('get_task_queue')
QueueManager.register('get_result_queue')

# Connect to the server, i.e. the machine running the master script (6.master.py),
# which must be started first:
server_addr = '127.0.0.1'
print('Connect to server %s...' % server_addr)
# Note that the port and authkey must exactly match those set in the master:
m = QueueManager(address=(server_addr, 5000), authkey=b'abc')
# Connect over the network:
m.connect()
# Get the Queue objects:
task = m.get_task_queue()
result = m.get_result_queue()
# Get tasks from the task queue and write results to the result queue:
for i in range(10):
    try:
        n = task.get(timeout=1)
        print('run task %d * %d...' % (n, n))
        r = '%d * %d = %d' % (n, n, n*n)
        time.sleep(1)
        result.put(r)
    except queue.Empty:   # note: queue.Empty, not Queue.Empty
        print('task queue is empty.')
# End of processing:
print('worker exit.')
