X Tutup
Skip to content

Latest commit

 

History

History
344 lines (256 loc) · 10.4 KB

File metadata and controls

344 lines (256 loc) · 10.4 KB

Python 运行环境初始化

初始化的工作主要由 Py_initializeEx(pythonrun.c) 完成。

下面分别对 Py_initializeEx 中的关键步骤进行分析。

线程环境初始化

在 Py_initializeEx 中对应的代码:

- PyInterpreterState_New

- PyThreadState_New

- PyThreadState_Swap(设置_PyThreadState_Current)
//pystate.c

typedef struct _is {

    struct _is *next;
    struct _ts *tstate_head;

    PyObject *modules;
    PyObject *sysdict;
    PyObject *builtins;

    PyObject *codec_search_path;
    PyObject *codec_search_cache;
    PyObject *codec_error_registry;

#ifdef HAVE_DLOPEN
    int dlopenflags;
#endif
#ifdef WITH_TSC
    int tscdump;
#endif

} PyInterpreterState;

typedef struct _ts {
    /* See Python/ceval.c for comments explaining most fields */

    struct _ts *next;
    PyInterpreterState *interp;

    struct _frame *frame;
    int recursion_depth;
    /* 'tracing' keeps track of the execution depth when tracing/profiling.
       This is to prevent the actual trace/profile code from being recorded in
       the trace/profile. */
    int tracing;
    int use_tracing;

    Py_tracefunc c_profilefunc;
    Py_tracefunc c_tracefunc;
    PyObject *c_profileobj;
    PyObject *c_traceobj;

    PyObject *curexc_type;
    PyObject *curexc_value;
    PyObject *curexc_traceback;

    PyObject *exc_type;
    PyObject *exc_value;
    PyObject *exc_traceback;

    PyObject *dict;  /* Stores per-thread state */

    /* tick_counter is incremented whenever the check_interval ticker
     * reaches zero. The purpose is to give a useful measure of the number
     * of interpreted bytecode instructions in a given thread.  This
     * extremely lightweight statistic collector may be of interest to
     * profilers (like psyco.jit()), although nothing in the core uses it.
     */
    int tick_counter;

    int gilstate_counter;

    PyObject *async_exc; /* Asynchronous exception to raise */
    long thread_id; /* Thread id where this tstate was created */

    /* XXX signal handlers should also be here */

} PyThreadState;

PyInterpreterState 模拟操作系统中的进程环境,PyThreadState 模拟线程环境。

PyInterpreterState 结构体中包含 next 字段,由此可以看出,在 Python 中可能存在一个由 PyInterpreterState 组成的链表,链表的 head 保存在 interp_head 中。PyThreadState 同理。

Python 线程环境模型:

                   +--------------------+             +----------------+      +----------------+
                   |                    | tstate_head |                | next |                |
interp_head +------> PyInterpreterState +------------->  PyThreadState +------>  PyThreadState +----> NULL
                   |                    |             |                |      |                |
                   +---------^----------+             +-------+--------+      +--------+-------+
                             |            interp              |          interp        |
                             +--------------------------------+------------------------+

类型系统初始化

在 Py_initializeEx 中对应的代码:

- _Py_ReadyTypes(调用 PyType_Ready 初始化各种内置类型)

- _PyInt_Init

- _PyFloat_Init

初始化內建的各种类型,如 int,str,list,dict。

系统 module 初始化

interp->modules = PyDict_New();

创建__builtin__ module

bimod = _PyBuiltin_Init();
if (bimod == NULL)
    Py_FatalError("Py_Initialize: can't initialize __builtin__");
interp->builtins = PyModule_GetDict(bimod);

_PyBuiltin_Init 的调用流程:_PyBuiltin_Init -> Py_InitModule4 -> PyImport_AddModule -> PyModule_New

_PyBuiltin_Init 返回一个 PyModuleObject,其定义如下:

typedef struct {
	PyObject_HEAD
	PyObject *md_dict;
} PyModuleObject;

可以看到 PyModuleObject 最关键的就是 md_dict。

Py_InitModule4 负责初始化一个 module,往 module 的 md_dict 中添加 methods。

PyObject * Py_InitModule4(const char *name, PyMethodDef *methods, const char *doc, PyObject *passthrough, int module_api_version)

_PyBuiltin_Init 在调用 Py_InitModule4 获得__builtin__模块之后,往模块的 dict 中添加基本类型和一些关键字,例如:

  • int, long, float, str, dict, list, set

  • None, True, False, bool

  • range, xrange, type, object

创建 sys module

sysmod = _PySys_Init();
if (sysmod == NULL)
    Py_FatalError("Py_Initialize: can't initialize sys");
interp->sysdict = PyModule_GetDict(sysmod);

_PySys_Init 和_PyBuiltin_Init 类似,首先初始化模块,然后往模块的 dict 中添加许多属性,例如:

  • stdin, stdout, stderr

  • version, hexversion, version_info, api_version

  • copyright, platform

  • executable, prefix, exec_prefix,

  • maxint, maxunicode

  • builtin_module_names

缓存 sys 模块

_PyImport_FixupExtension("sys", "sys");

设置 module 搜索路径 sys.path

PySys_SetPath(Py_GetPath());

下面的注释摘自getpath.c,介绍了 Python 如何获取 prefix、exec_prefix、sys.path。

/* Search in some common locations for the associated Python libraries.
 *
 * Two directories must be found, the platform independent directory
 * (prefix), containing the common .py and .pyc files, and the platform
 * dependent directory (exec_prefix), containing the shared library
 * modules.  Note that prefix and exec_prefix can be the same directory,
 * but for some installations, they are different.
 *
 * Py_GetPath() carries out separate searches for prefix and exec_prefix.
 * Each search tries a number of different locations until a ``landmark''
 * file or directory is found.  If no prefix or exec_prefix is found, a
 * warning message is issued and the preprocessor defined PREFIX and
 * EXEC_PREFIX are used (even though they will not work); python carries on
 * as best as is possible, but most imports will fail.
 *
 * Before any searches are done, the location of the executable is
 * determined.  If argv[0] has one or more slashs in it, it is used
 * unchanged.  Otherwise, it must have been invoked from the shell's path,
 * so we search $PATH for the named executable and use that.  If the
 * executable was not found on $PATH (or there was no $PATH environment
 * variable), the original argv[0] string is used.
 *
 * Next, the executable location is examined to see if it is a symbolic
 * link.  If so, the link is chased (correctly interpreting a relative
 * pathname if one is found) and the directory of the link target is used.
 *
 * Finally, argv0_path is set to the directory containing the executable
 * (i.e. the last component is stripped).
 *
 * With argv0_path in hand, we perform a number of steps.  The same steps
 * are performed for prefix and for exec_prefix, but with a different
 * landmark.
 *
 * Step 1. Are we running python out of the build directory?  This is
 * checked by looking for a different kind of landmark relative to
 * argv0_path.  For prefix, the landmark's path is derived from the VPATH
 * preprocessor variable (taking into account that its value is almost, but
 * not quite, what we need).  For exec_prefix, the landmark is
 * Modules/Setup.  If the landmark is found, we're done.
 *
 * For the remaining steps, the prefix landmark will always be
 * lib/python$VERSION/os.py and the exec_prefix will always be
 * lib/python$VERSION/lib-dynload, where $VERSION is Python's version
 * number as supplied by the Makefile.  Note that this means that no more
 * build directory checking is performed; if the first step did not find
 * the landmarks, the assumption is that python is running from an
 * installed setup.
 *
 * Step 2. See if the $PYTHONHOME environment variable points to the
 * installed location of the Python libraries.  If $PYTHONHOME is set, then
 * it points to prefix and exec_prefix.  $PYTHONHOME can be a single
 * directory, which is used for both, or the prefix and exec_prefix
 * directories separated by a colon.
 *
 * Step 3. Try to find prefix and exec_prefix relative to argv0_path,
 * backtracking up the path until it is exhausted.  This is the most common
 * step to succeed.  Note that if prefix and exec_prefix are different,
 * exec_prefix is more likely to be found; however if exec_prefix is a
 * subdirectory of prefix, both will be found.
 *
 * Step 4. Search the directories pointed to by the preprocessor variables
 * PREFIX and EXEC_PREFIX.  These are supplied by the Makefile but can be
 * passed in as options to the configure script.
 *
 * That's it!
 *
 * Well, almost.  Once we have determined prefix and exec_prefix, the
 * preprocessor variable PYTHONPATH is used to construct a path.  Each
 * relative path on PYTHONPATH is prefixed with prefix.  Then the directory
 * containing the shared library modules is appended.  The environment
 * variable $PYTHONPATH is inserted in front of it all.  Finally, the
 * prefix and exec_prefix globals are tweaked so they reflect the values
 * expected by other code, by stripping the "lib/python$VERSION/..." stuff
 * off.  If either points to the build directory, the globals are reset to
 * the corresponding preprocessor variables (so sys.prefix will reflect the
 * installation location, even though sys.path points into the build
 * directory).  This seems to make more sense given that currently the only
 * known use of sys.prefix and sys.exec_prefix is for the ILU installation
 * process to find the installed Python tree.
 */

初始化 ImportTab

_PyImport_Init();

初始化异常

_PyExc_Init();

创建 exceptions 模块,并将各种 Exception 添加到__builtin__中

初始化 import hooks

_PyImportHooks_Init()

创建 main module

initmain();

创建 main 模块,并将模块属性__builtins__设置为__builtin__模块

设置site-specific的module的搜索路径

initsite();

设置 stdin、stdout、stderr 的编码

激活 Python 虚拟机

Py_Main函数时 Python 的主程序入口,在完成初始化工作之后,开始执行 Python脚本(或者启动 REPL)。

if (command) {
    sts = PyRun_SimpleStringFlags(command, &cf) != 0;
    free(command);
} else if (module) {
    sts = RunModule(module);
    free(module);
}
else {
    if (filename == NULL && stdin_is_interactive) {
        RunStartupFile(&cf);
    }
    /* XXX */
    sts = PyRun_AnyFileExFlags(
        fp,
        filename == NULL ? "<stdin>" : filename,
        filename != NULL, &cf) != 0;
}
X Tutup