初始化的工作主要由 Py_initializeEx(pythonrun.c) 完成。
下面分别对 Py_initializeEx 中的关键步骤进行分析。
在 Py_initializeEx 中对应的代码:
- PyInterpreterState_New
- PyThreadState_New
- PyThreadState_Swap(设置_PyThreadState_Current)
//pystate.c
typedef struct _is {
struct _is *next;
struct _ts *tstate_head;
PyObject *modules;
PyObject *sysdict;
PyObject *builtins;
PyObject *codec_search_path;
PyObject *codec_search_cache;
PyObject *codec_error_registry;
#ifdef HAVE_DLOPEN
int dlopenflags;
#endif
#ifdef WITH_TSC
int tscdump;
#endif
} PyInterpreterState;
typedef struct _ts {
/* See Python/ceval.c for comments explaining most fields */
struct _ts *next;
PyInterpreterState *interp;
struct _frame *frame;
int recursion_depth;
/* 'tracing' keeps track of the execution depth when tracing/profiling.
This is to prevent the actual trace/profile code from being recorded in
the trace/profile. */
int tracing;
int use_tracing;
Py_tracefunc c_profilefunc;
Py_tracefunc c_tracefunc;
PyObject *c_profileobj;
PyObject *c_traceobj;
PyObject *curexc_type;
PyObject *curexc_value;
PyObject *curexc_traceback;
PyObject *exc_type;
PyObject *exc_value;
PyObject *exc_traceback;
PyObject *dict; /* Stores per-thread state */
/* tick_counter is incremented whenever the check_interval ticker
* reaches zero. The purpose is to give a useful measure of the number
* of interpreted bytecode instructions in a given thread. This
* extremely lightweight statistic collector may be of interest to
* profilers (like psyco.jit()), although nothing in the core uses it.
*/
int tick_counter;
int gilstate_counter;
PyObject *async_exc; /* Asynchronous exception to raise */
long thread_id; /* Thread id where this tstate was created */
/* XXX signal handlers should also be here */
} PyThreadState;PyInterpreterState 模拟操作系统中的进程环境,PyThreadState 模拟线程环境。
PyInterpreterState 结构体中包含 next 字段,由此可以看出,在 Python 中可能存在一个由 PyInterpreterState 组成的链表,链表的 head 保存在 interp_head 中。PyThreadState 同理。
Python 线程环境模型:
+--------------------+ +----------------+ +----------------+
| | tstate_head | | next | |
interp_head +------> PyInterpreterState +-------------> PyThreadState +------> PyThreadState +----> NULL
| | | | | |
+---------^----------+ +-------+--------+ +--------+-------+
| interp | interp |
+--------------------------------+------------------------+
在 Py_initializeEx 中对应的代码:
- _Py_ReadyTypes(调用 PyType_Ready 初始化各种内置类型)
- _PyInt_Init
- _PyFloat_Init
初始化內建的各种类型,如 int,str,list,dict。
interp->modules = PyDict_New();bimod = _PyBuiltin_Init();
if (bimod == NULL)
Py_FatalError("Py_Initialize: can't initialize __builtin__");
interp->builtins = PyModule_GetDict(bimod);_PyBuiltin_Init 的调用流程:_PyBuiltin_Init -> Py_InitModule4 -> PyImport_AddModule -> PyModule_New
_PyBuiltin_Init 返回一个 PyModuleObject,其定义如下:
typedef struct {
PyObject_HEAD
PyObject *md_dict;
} PyModuleObject;可以看到 PyModuleObject 最关键的就是 md_dict。
Py_InitModule4 负责初始化一个 module,往 module 的 md_dict 中添加 methods。
PyObject * Py_InitModule4(const char *name, PyMethodDef *methods, const char *doc, PyObject *passthrough, int module_api_version)_PyBuiltin_Init 在调用 Py_InitModule4 获得__builtin__模块之后,往模块的 dict 中添加基本类型和一些关键字,例如:
-
int, long, float, str, dict, list, set
-
None, True, False, bool
-
range, xrange, type, object
sysmod = _PySys_Init();
if (sysmod == NULL)
Py_FatalError("Py_Initialize: can't initialize sys");
interp->sysdict = PyModule_GetDict(sysmod);_PySys_Init 和_PyBuiltin_Init 类似,首先初始化模块,然后往模块的 dict 中添加许多属性,例如:
-
stdin, stdout, stderr
-
version, hexversion, version_info, api_version
-
copyright, platform
-
executable, prefix, exec_prefix,
-
maxint, maxunicode
-
builtin_module_names
_PyImport_FixupExtension("sys", "sys");PySys_SetPath(Py_GetPath());下面的注释摘自getpath.c,介绍了 Python 如何获取 prefix、exec_prefix、sys.path。
/* Search in some common locations for the associated Python libraries.
*
* Two directories must be found, the platform independent directory
* (prefix), containing the common .py and .pyc files, and the platform
* dependent directory (exec_prefix), containing the shared library
* modules. Note that prefix and exec_prefix can be the same directory,
* but for some installations, they are different.
*
* Py_GetPath() carries out separate searches for prefix and exec_prefix.
* Each search tries a number of different locations until a ``landmark''
* file or directory is found. If no prefix or exec_prefix is found, a
* warning message is issued and the preprocessor defined PREFIX and
* EXEC_PREFIX are used (even though they will not work); python carries on
* as best as is possible, but most imports will fail.
*
* Before any searches are done, the location of the executable is
* determined. If argv[0] has one or more slashs in it, it is used
* unchanged. Otherwise, it must have been invoked from the shell's path,
* so we search $PATH for the named executable and use that. If the
* executable was not found on $PATH (or there was no $PATH environment
* variable), the original argv[0] string is used.
*
* Next, the executable location is examined to see if it is a symbolic
* link. If so, the link is chased (correctly interpreting a relative
* pathname if one is found) and the directory of the link target is used.
*
* Finally, argv0_path is set to the directory containing the executable
* (i.e. the last component is stripped).
*
* With argv0_path in hand, we perform a number of steps. The same steps
* are performed for prefix and for exec_prefix, but with a different
* landmark.
*
* Step 1. Are we running python out of the build directory? This is
* checked by looking for a different kind of landmark relative to
* argv0_path. For prefix, the landmark's path is derived from the VPATH
* preprocessor variable (taking into account that its value is almost, but
* not quite, what we need). For exec_prefix, the landmark is
* Modules/Setup. If the landmark is found, we're done.
*
* For the remaining steps, the prefix landmark will always be
* lib/python$VERSION/os.py and the exec_prefix will always be
* lib/python$VERSION/lib-dynload, where $VERSION is Python's version
* number as supplied by the Makefile. Note that this means that no more
* build directory checking is performed; if the first step did not find
* the landmarks, the assumption is that python is running from an
* installed setup.
*
* Step 2. See if the $PYTHONHOME environment variable points to the
* installed location of the Python libraries. If $PYTHONHOME is set, then
* it points to prefix and exec_prefix. $PYTHONHOME can be a single
* directory, which is used for both, or the prefix and exec_prefix
* directories separated by a colon.
*
* Step 3. Try to find prefix and exec_prefix relative to argv0_path,
* backtracking up the path until it is exhausted. This is the most common
* step to succeed. Note that if prefix and exec_prefix are different,
* exec_prefix is more likely to be found; however if exec_prefix is a
* subdirectory of prefix, both will be found.
*
* Step 4. Search the directories pointed to by the preprocessor variables
* PREFIX and EXEC_PREFIX. These are supplied by the Makefile but can be
* passed in as options to the configure script.
*
* That's it!
*
* Well, almost. Once we have determined prefix and exec_prefix, the
* preprocessor variable PYTHONPATH is used to construct a path. Each
* relative path on PYTHONPATH is prefixed with prefix. Then the directory
* containing the shared library modules is appended. The environment
* variable $PYTHONPATH is inserted in front of it all. Finally, the
* prefix and exec_prefix globals are tweaked so they reflect the values
* expected by other code, by stripping the "lib/python$VERSION/..." stuff
* off. If either points to the build directory, the globals are reset to
* the corresponding preprocessor variables (so sys.prefix will reflect the
* installation location, even though sys.path points into the build
* directory). This seems to make more sense given that currently the only
* known use of sys.prefix and sys.exec_prefix is for the ILU installation
* process to find the installed Python tree.
*/
_PyImport_Init();_PyExc_Init();创建 exceptions 模块,并将各种 Exception 添加到__builtin__中
_PyImportHooks_Init()initmain();创建 main 模块,并将模块属性__builtins__设置为__builtin__模块
initsite();Py_Main函数时 Python 的主程序入口,在完成初始化工作之后,开始执行 Python脚本(或者启动 REPL)。
if (command) {
sts = PyRun_SimpleStringFlags(command, &cf) != 0;
free(command);
} else if (module) {
sts = RunModule(module);
free(module);
}
else {
if (filename == NULL && stdin_is_interactive) {
RunStartupFile(&cf);
}
/* XXX */
sts = PyRun_AnyFileExFlags(
fp,
filename == NULL ? "<stdin>" : filename,
filename != NULL, &cf) != 0;
}