-
编译
Python 编译器将 py 文件编译成字节码(PyCodeObject)。
-
执行
Python 虚拟机执行字节码。
PyCodeObject 定义:
// Include/code.h
typedef struct {
PyObject_HEAD
int co_argcount; /* #arguments, except *args */
int co_nlocals; /* #local variables */
int co_stacksize; /* #entries needed for evaluation stack */
int co_flags; /* CO_..., see below */
PyObject *co_code; /* instruction opcodes */
PyObject *co_consts; /* list (constants used) */
PyObject *co_names; /* list of strings (names used) */
PyObject *co_varnames; /* tuple of strings (local variable names) */
PyObject *co_freevars; /* tuple of strings (free variable names) */
PyObject *co_cellvars; /* tuple of strings (cell variable names) */
/* The rest doesn't count for hash/cmp */
PyObject *co_filename; /* string (where it was loaded from) */
PyObject *co_name; /* string (name, for reference) */
int co_firstlineno; /* first source line number */
PyObject *co_lnotab; /* string (encoding addr<->lineno mapping) */
void *co_zombieframe; /* for optimization only (see frameobject.c) */
} PyCodeObject;Python 会对源代码中的每个代码块(Code Block)生成一个 PyCodeObject 对象,一个作用域(名字空间)就算做一个代码块。
代码块对应的字节码就存放在 PyCodeObject 的 co_code字段。
pyc文件包含三部分信息:
-
magic number
用于标识 pyc 版本,不同Python 版本的 magic number 不一样。通过 magic number 可以避免 Python 加载错误版本的 pyc 文件。
Python2.5 的 magic number 为:
#define MAGIC (62131 | ((long)'\r'<<16) | ((long)'\n'<<24)) /* Magic word as global; note that _PyImport_Init() can change the value of this global to accommodate for alterations of how the compiler works which are enabled by command line switches. */ static long pyc_magic = MAGIC;
-
mtime
文件修改时间,用于和 py 文件的修改时间进行对比。Python 在导入模块时优先导入pyc 文件,但是有可能 pyc 文件对应的 py 文件已经被修改了,所以 Python 需要对比 pyc 文件和 py 文件的修改时间,如果 py 文件的修改时间更新,则需要重新编译 pyc 文件。
-
字节码,即 PyCodeObject。
生成 pyc 文件的代码如下:
// Python/import.c
/* Write a compiled module to a file, placing the time of last
modification of its source into the header.
Errors are ignored, if a write error occurs an attempt is made to
remove the file. */
static void
write_compiled_module(PyCodeObject *co, char *cpathname, time_t mtime)
{
FILE *fp;
fp = open_exclusive(cpathname);
if (fp == NULL) {
if (Py_VerboseFlag)
PySys_WriteStderr(
"# can't create %s\n", cpathname);
return;
}
PyMarshal_WriteLongToFile(pyc_magic, fp, Py_MARSHAL_VERSION);
/* First write a 0 for mtime */
PyMarshal_WriteLongToFile(0L, fp, Py_MARSHAL_VERSION);
PyMarshal_WriteObjectToFile((PyObject *)co, fp, Py_MARSHAL_VERSION);
if (fflush(fp) != 0 || ferror(fp)) {
if (Py_VerboseFlag)
PySys_WriteStderr("# can't write %s\n", cpathname);
/* Don't keep partial file */
fclose(fp);
(void) unlink(cpathname);
return;
}
/* Now write the true mtime */
fseek(fp, 4L, 0);
assert(mtime < LONG_MAX);
PyMarshal_WriteLongToFile((long)mtime, fp, Py_MARSHAL_VERSION);
fflush(fp);
fclose(fp);
if (Py_VerboseFlag)
PySys_WriteStderr("# wrote %s\n", cpathname);
}PyMarshal_WriteObjectToFile最终调用 w_object将 PyCodeObject 写入文件中,w_object的代码如下(有删减):
// Python/marchal.c
// w_object 的代码很长,都是 if/else if 判断 object 的具体类型。
static void
w_object(PyObject *v, WFILE *p)
{
...
else if (PyCode_Check(v)) {
PyCodeObject *co = (PyCodeObject *)v;
w_byte(TYPE_CODE, p);
w_long(co->co_argcount, p);
w_long(co->co_nlocals, p);
w_long(co->co_stacksize, p);
w_long(co->co_flags, p);
w_object(co->co_code, p);
w_object(co->co_consts, p);
w_object(co->co_names, p);
w_object(co->co_varnames, p);
w_object(co->co_freevars, p);
w_object(co->co_cellvars, p);
w_object(co->co_filename, p);
w_object(co->co_name, p);
w_long(co->co_firstlineno, p);
w_object(co->co_lnotab, p);
}
...我自己写了一个 Python 脚本解析 pyc 文件,放在 这里
Python 标准库中的 dis 模块可以用来获取代码对应的字节码。