X Tutup
Skip to content

Latest commit

 

History

History
155 lines (119 loc) · 4.63 KB

File metadata and controls

155 lines (119 loc) · 4.63 KB

Python 的编译结果——Code 对象与 pyc 文件

Python 脚本的执行过程

  • 编译

    Python 编译器将 py 文件编译成字节码(PyCodeObject)。

  • 执行

    Python 虚拟机执行字节码。

PyCodeObject

PyCodeObject 定义:

// Include/code.h

typedef struct {
    PyObject_HEAD
    int co_argcount;		/* #arguments, except *args */
    int co_nlocals;		/* #local variables */
    int co_stacksize;		/* #entries needed for evaluation stack */
    int co_flags;		/* CO_..., see below */
    PyObject *co_code;		/* instruction opcodes */
    PyObject *co_consts;	/* list (constants used) */
    PyObject *co_names;		/* list of strings (names used) */
    PyObject *co_varnames;	/* tuple of strings (local variable names) */
    PyObject *co_freevars;	/* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;	/* string (where it was loaded from) */
    PyObject *co_name;		/* string (name, for reference) */
    int co_firstlineno;		/* first source line number */
    PyObject *co_lnotab;	/* string (encoding addr<->lineno mapping) */
    void *co_zombieframe;     /* for optimization only (see frameobject.c) */
} PyCodeObject;

Python 会对源代码中的每个代码块(Code Block)生成一个 PyCodeObject 对象,一个作用域(名字空间)就算做一个代码块。

代码块对应的字节码就存放在 PyCodeObject 的 co_code字段。

生成 pyc 文件

pyc文件包含三部分信息:

  • magic number

    用于标识 pyc 版本,不同Python 版本的 magic number 不一样。通过 magic number 可以避免 Python 加载错误版本的 pyc 文件。

    Python2.5 的 magic number 为:

    #define MAGIC (62131 | ((long)'\r'<<16) | ((long)'\n'<<24))
    
    /* Magic word as global; note that _PyImport_Init() can change the
    value of this global to accommodate for alterations of how the
    compiler works which are enabled by command line switches. */
    static long pyc_magic = MAGIC;
  • mtime

    文件修改时间,用于和 py 文件的修改时间进行对比。Python 在导入模块时优先导入pyc 文件,但是有可能 pyc 文件对应的 py 文件已经被修改了,所以 Python 需要对比 pyc 文件和 py 文件的修改时间,如果 py 文件的修改时间更新,则需要重新编译 pyc 文件。

  • 字节码,即 PyCodeObject。

生成 pyc 文件的代码如下:

// Python/import.c

/* Write a compiled module to a file, placing the time of last
   modification of its source into the header.
   Errors are ignored, if a write error occurs an attempt is made to
   remove the file. */

static void
write_compiled_module(PyCodeObject *co, char *cpathname, time_t mtime)
{
	FILE *fp;

	fp = open_exclusive(cpathname);
	if (fp == NULL) {
		if (Py_VerboseFlag)
			PySys_WriteStderr(
				"# can't create %s\n", cpathname);
		return;
	}
	PyMarshal_WriteLongToFile(pyc_magic, fp, Py_MARSHAL_VERSION);
	/* First write a 0 for mtime */
	PyMarshal_WriteLongToFile(0L, fp, Py_MARSHAL_VERSION);
	PyMarshal_WriteObjectToFile((PyObject *)co, fp, Py_MARSHAL_VERSION);
	if (fflush(fp) != 0 || ferror(fp)) {
		if (Py_VerboseFlag)
			PySys_WriteStderr("# can't write %s\n", cpathname);
		/* Don't keep partial file */
		fclose(fp);
		(void) unlink(cpathname);
		return;
	}
	/* Now write the true mtime */
	fseek(fp, 4L, 0);
	assert(mtime < LONG_MAX);
	PyMarshal_WriteLongToFile((long)mtime, fp, Py_MARSHAL_VERSION);
	fflush(fp);
	fclose(fp);
	if (Py_VerboseFlag)
		PySys_WriteStderr("# wrote %s\n", cpathname);
}

PyMarshal_WriteObjectToFile最终调用 w_object将 PyCodeObject 写入文件中,w_object的代码如下(有删减):

// Python/marchal.c

// w_object 的代码很长,都是 if/else if 判断 object 的具体类型。

static void
w_object(PyObject *v, WFILE *p)
{
    ...
    else if (PyCode_Check(v)) {
		PyCodeObject *co = (PyCodeObject *)v;
		w_byte(TYPE_CODE, p);
		w_long(co->co_argcount, p);
		w_long(co->co_nlocals, p);
		w_long(co->co_stacksize, p);
		w_long(co->co_flags, p);
		w_object(co->co_code, p);
		w_object(co->co_consts, p);
		w_object(co->co_names, p);
		w_object(co->co_varnames, p);
		w_object(co->co_freevars, p);
		w_object(co->co_cellvars, p);
		w_object(co->co_filename, p);
		w_object(co->co_name, p);
		w_long(co->co_firstlineno, p);
		w_object(co->co_lnotab, p);
	}
    ...

解析 pyc 文件

我自己写了一个 Python 脚本解析 pyc 文件,放在 这里

dis

Python 标准库中的 dis 模块可以用来获取代码对应的字节码。

X Tutup