Prologue
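Objects are created in two main ways: through the Python/C API, or by calling a type object. Instances of built-in types support both. A list, for example, can be created with [] or with list(); the former goes through the Python/C API, the latter calls the type object.
Instances of user-defined classes, however, can only be created by calling the type object. An object that can be called is said to be callable; one that cannot is not.
Whether an object is callable depends on whether its type object defines a particular method. From Python's point of view that method is __call__; from the interpreter's point of view it is tp_call.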
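As a small illustration of that rule, the sketch below defines two classes (Greeter and Plain are made-up names for this example) and uses the built-in callable() to check their instances.

```python
class Greeter:
    """Instances are callable because the class defines __call__."""

    def __call__(self, name):
        return f"hello, {name}"


class Plain:
    """No __call__ here, so instances are not callable."""


greeter = Greeter()
plain = Plain()

print(callable(greeter))   # True
print(callable(plain))     # False
print(greeter("satori"))   # hello, satori
# greeter(...) is shorthand for looking up __call__ on the type
print(type(greeter).__call__(greeter, "satori"))  # hello, satori
```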
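Object calls from Python's point of view
Calling int, str, or tuple creates an integer, a string, or a tuple, and calling a user-defined class creates an instance of it, so type objects are callable. It follows that the type object of these type objects (int, str, tuple, user-defined classes, and so on), namely type, must have a __call__ method.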
# int can be called,
# so its type object, the metaclass type, must have a __call__ method
print(hasattr(type, "__call__"))  # True
# Calling an object is equivalent to calling the __call__ method of its type object,
# so int(3.14) is really equivalent to the following
print(type.__call__(int, 3.14))  # 3
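Note: the wording here can get a little tangled. int, str, and float are type objects (classes, put simply), and 123, "你好", and 3.14 are their instances; nothing controversial there. But is type a type object? Clearly yes. We call it the metaclass, but it is still a type object, and print(type) shows a class as well.
Relative to type, then, do int, str, and float become instance objects? They do, because their type is type.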
So a class has a dual nature:
- From the standpoint of an instance object (such as 123, "satori", [], 3.14), it is a type object
- From the standpoint of type, it is an instance object
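Likewise, the type of type is also type, so type is both the type object of type and an instance of type. This is a bit of a mouthful but not hard to follow. To keep the rest of the discussion unambiguous, we fix the terminology here:
- Integers, floats, strings, and so on are called instance objects
- int, float, str, dict, and our own classes are called type objects
- type is also a type object, but we call it the metaclass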
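These three layers can be checked directly with type() and isinstance(). A short sketch (the class Girl is only a placeholder):

```python
# Instance objects: their type is an ordinary type object
print(type(123) is int)          # True
print(type("satori") is str)     # True

# Type objects: viewed from type, they are instances of type
print(type(int) is type)         # True
print(isinstance(str, type))     # True


# A user-defined class is also an instance of type
class Girl:
    pass


print(type(Girl) is type)        # True

# The metaclass: type is its own type
print(type(type) is type)        # True
print(isinstance(type, type))    # True
```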
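So type has a __call__ method, which means every type object is callable, because calling a type object calls type's __call__. Whether an instance object can be called is another matter: it depends on whether its type object defines __call__, since calling an object means executing the __call__ method of its type object.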
class A:
    pass

a = A()
# Our class A does not define __call__,
# so a cannot be called
try:
    a()
except Exception as e:
    # Tells us that instances of A are not callable
    print(e)  # 'A' object is not callable

# Now give A a __call__
type.__setattr__(A, "__call__", lambda self: "this is __call__")
# and the call works
print(a())  # this is __call__
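This is what a dynamic language allows: even after a class has been created, it can still be modified through type, which a static language does not support. type is the metaclass of every class and controls how our classes are created; this old but powerful class lets us play all sorts of tricks.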
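One detail worth checking is that the lookup really happens on the type object and not on the instance. The following sketch assigns __call__ first on an instance and then on the class; only the second assignment makes the instance callable.

```python
class A:
    pass


a = A()
# Assigning __call__ on the instance only adds an entry to a.__dict__ ...
a.__call__ = lambda: "instance-level __call__"

try:
    a()
    instance_call_worked = True
except TypeError as e:
    instance_call_worked = False
    print(e)  # 'A' object is not callable

# ... whereas assigning it on the class changes what the interpreter looks at
A.__call__ = lambda self: "class-level __call__"
result = a()
print(result)  # class-level __call__
```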
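For built-in classes, however, type cannot add, delete, or modify attributes, because built-in classes are statically defined at the C level. As the source code shows, the built-in classes, the metaclass included, are all PyTypeObject objects declared as global variables in C; in other words, they exist as static classes. So although type is the metaclass of all type objects, it can only add, delete, or modify attributes on classes we define ourselves.
As explained earlier, Python's dynamism is conferred by the interpreter at run time, while it executes bytecode. Dynamically setting attributes or methods on a class therefore only works for dynamic classes, that is, classes defined with the class keyword in a .py file.
Static classes, and the extension classes defined when writing extension modules (the two are equivalent), are already C-level data structures once compiled. They are not built by the interpreter executing bytecode, so the interpreter will not tamper with them. What is already compiled needs no interpreting.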
try:
    type.__setattr__(dict, "__call__", lambda self: "this is __call__")
except Exception as e:
    # The exact wording depends on the Python version
    print(e)  # can't set attributes of built-in/extension type 'dict'
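An exception is raised, telling us that attributes cannot be set on the built-in/extension type dict. These types bypass the step of being built by the interpreter at run time, so their attributes cannot be set dynamically.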
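The restriction applies to the static type itself, not to classes derived from it with the class keyword. A minimal sketch, where MyDict and describe are hypothetical names chosen for the example:

```python
# MyDict is a hypothetical name; any class-statement subclass would do
class MyDict(dict):
    pass


# Setting an attribute on the built-in type is rejected ...
try:
    dict.describe = lambda self: f"{len(self)} keys"
    builtin_accepts = True
except TypeError:
    builtin_accepts = False

# ... but the subclass is created at run time, so it accepts the same attribute
MyDict.describe = lambda self: f"{len(self)} keys"

d = MyDict(name="satori", age=17)
print(builtin_accepts)   # False
print(d.describe())      # 2 keys
```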
The same holds for their instances: instances of a static class cannot have attributes set dynamically either:
class Girl:
    pass

g = Girl()
g.name = "古明地觉"
# Attributes can be set by hand on an instance of our own class
print(g.name)  # 古明地觉

lst = list()
try:
    lst.name = "古明地觉"
except Exception as e:
    # but not on an instance of a built-in type
    print(e)  # 'list' object has no attribute 'name'
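Why doesn't this work for a list? Because instances of built-in types such as list have no __dict__ attribute dictionary; their attributes and methods are fixed at the C level and cannot be added dynamically. A user-defined class that sets __slots__ behaves the same way as the built-in classes.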
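The __slots__ remark can be verified with a short sketch. Point is a hypothetical class used only for illustration; the exact error text differs between Python versions, so only the exception type is printed.

```python
class Point:
    # __slots__ fixes the set of instance attributes; no per-instance __dict__
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(hasattr(p, "__dict__"))   # False
print(hasattr([], "__dict__"))  # False

try:
    p.z = 3
    error = None
except AttributeError as e:
    error = e

print(type(error).__name__)     # AttributeError
```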
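That said, later we will show how to get around these restrictions by modifying the interpreter at run time. Here is a taste. Didn't we just say static classes cannot have attributes set dynamically? Let me prove myself wrong: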
import gc

try:
    type.__setattr__(list, "ping", "pong")
except TypeError as e:
    print(e)  # can't set attributes of built-in/extension type 'list'

# Setting it that way fails, so let's get around it.
# tuple.__dict__ is a read-only mappingproxy; gc.get_referents exposes the real dict behind it
attrs = gc.get_referents(tuple.__dict__)[0]
attrs["ping"] = "pong"
print(().ping)  # pong
attrs["append"] = lambda self, item: self + (item,)
print(
    ().append(1).append(2).append(3)
)  # (1, 2, 3)
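Ouch. To be fair, this is only a parlor trick. Once we have covered the whole of CPython, we will come back to how to modify the interpreter at run time, for example making tuples mutable or letting Python truly use multiple cores.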
Object calls from the interpreter's point of view
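Take the built-in type float as an example. A PyFloatObject can be created either with 3.14 or with float(3.14). The former goes through the Python/C API: 3.14 is parsed directly into a C-level data structure, a PyFloatObject instance. The latter uses the type object: float is called with 3.14 as the argument, and the result is likewise a PyFloatObject instance at the C level.
Creation through the Python/C API is already clear: the interpreter infers from the value which underlying data structure is needed and creates it directly. Here we focus on creating instance objects by calling a type.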
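The difference between the two routes is visible in the bytecode. The sketch below uses the standard dis module; opnames is a helper defined only for this example, and because opcode names vary between CPython versions the checks look only for a name starting with CALL.

```python
import dis


def opnames(source):
    """Return the opcode names the compiler emits for an expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")
print(literal_ops)
print(call_ops)

# The literal is built by a dedicated instruction, with no call involved
print("BUILD_LIST" in literal_ops)                        # True
print(any(op.startswith("CALL") for op in literal_ops))   # False
# list() loads the name and then performs a generic call
print(any(op.startswith("CALL") for op in call_ops))      # True

# A float literal is created at compile time and stored as a constant
print(3.14 in compile("3.14", "<example>", "eval").co_consts)  # True
```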
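For an object to be callable, its type object must have tp_call (more precisely, the tp_call member must be a function pointer and not NULL). PyFloat_Type is callable, so the tp_call of PyType_Type must be a function pointer. We have already verified this at the Python level; now let's look at the source code.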
// typeobject.c
PyTypeObject PyType_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "type",                                     /* tp_name */
    sizeof(PyHeapTypeObject),                   /* tp_basicsize */
    sizeof(PyMemberDef),                        /* tp_itemsize */
    (destructor)type_dealloc,                   /* tp_dealloc */
    //...                                       /* tp_hash */
    (ternaryfunc)type_call,                     /* tp_call */
    //...
};
When PyType_Type is defined, the tp_call member of the PyTypeObject is set to type_call. This is a function pointer, and calling PyFloat_Type invokes the function it points to.
So at the C level, float(3.14) is equivalent to:
(&PyFloat_Type) -> ob_type -> tp_call(&PyFloat_Type, args, kwargs);
// that is:
(&PyType_Type) -> tp_call(&PyFloat_Type, args, kwargs);
// When PyType_Type was defined, its tp_call member was set to type_call,
// so in the end this amounts to
type_call(&PyFloat_Type, args, kwargs)
Demonstrating the same process in Python:
# float(3.14) is equivalent to
f1 = float.__class__.__call__(float, 3.14)
# which is equivalent to
f2 = type.__call__(float, 3.14)
print(f1, f2)  # 3.14 3.14
That is the secret behind float(3.14). You have probably guessed how list and dict are instantiated: in exactly the same way.
# lst = list("abcd")
lst = list.__class__.__call__(list, "abcd")
print(lst)  # ['a', 'b', 'c', 'd']
# dct = dict([("name", "古明地觉"), ("age", 17)])
dct = dict.__class__.__call__(dict, [("name", "古明地觉"), ("age", 17)])
print(dct)  # {'name': '古明地觉', 'age': 17}
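Because calling a class runs the __call__ of its metaclass, a custom metaclass can intercept instantiation. A minimal sketch, assuming a hypothetical metaclass Meta that records each call before delegating to type.__call__:

```python
class Meta(type):
    """A hypothetical metaclass that records every instantiation."""

    calls = []

    def __call__(cls, *args, **kwargs):
        # Runs whenever a class whose metaclass is Meta gets called
        Meta.calls.append(cls.__name__)
        # Delegate to type.__call__, which does the real work
        return super().__call__(*args, **kwargs)


class Girl(metaclass=Meta):
    def __init__(self, name):
        self.name = name


g = Girl("satori")
print(Meta.calls)   # ['Girl']
print(g.name)       # satori
```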
Finally, let's take a look at the type_call function. As we said, type's __call__ method corresponds to the type_call function at the C level, which lives in Objects/typeobject.c.
static PyObject *
type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // If we are calling float,
    // then type here is &PyFloat_Type.
    // Declare a PyObject *, which will be
    // the pointer to the instance object we return
    PyObject *obj;

    // Check whether tp_new is NULL. You may have guessed what tp_new is:
    // just as __call__ corresponds to tp_call,
    // __new__ corresponds to tp_new, which allocates space for the instance
    if (type->tp_new == NULL) {
        // tp_new is a function pointer to the actual constructor.
        // If tp_new is NULL, the type has no constructor,
        // so an error is raised saying its instances cannot be created
        PyErr_Format(PyExc_TypeError,
                     "cannot create '%.100s' instances",
                     type->tp_name);
        return NULL;
    }

    // Allocate space via tp_new.
    // The instance object now exists; tp_new returns a pointer to it
    obj = type->tp_new(type, args, kwds);
    // Sanity check that the result and the error indicator agree; ignore for now
    obj = _Py_CheckFunctionResult((PyObject*)type, obj, NULL);
    if (obj == NULL)
        return NULL;

    // The type parameter is usually an ordinary type object, but it can also be the metaclass:
    // the metaclass is a PyTypeObject too,
    // and calling it also runs type_call.
    // This checks for the call type(x) with exactly one argument,
    // which just returns the type of x (a type object, not a new instance),
    // so tp_init must not be run on the result
    if (type == &PyType_Type &&
        PyTuple_Check(args) && PyTuple_GET_SIZE(args) == 1 &&
        (kwds == NULL ||
         (PyDict_Check(kwds) && PyDict_GET_SIZE(kwds) == 0)))
        return obj;

    // tp_new should return (a pointer to) an instance of the given type object.
    // If it did not, return obj as is.
    // This may look odd; we discuss it below
    if (!PyType_IsSubtype(Py_TYPE(obj), type))
        return obj;

    // Get the type of obj
    type = Py_TYPE(obj);
    // Run tp_init,
    // which corresponds to __init__.
    // This matches how class instantiation works in Python.
    if (type->tp_init != NULL) {
        // Run tp_init with the object returned by tp_new as self
        int res = type->tp_init(obj, args, kwds);
        if (res < 0) {
            // On failure, decrement the reference count and set obj to NULL
            assert(PyErr_Occurred());
            Py_DECREF(obj);
            obj = NULL;
        }
        else {
            assert(!PyErr_Occurred());
        }
    }
    // Return obj
    return obj;
}
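The first branch of type_call, the one for a NULL tp_new, can be triggered from Python with a type that offers no constructor. The sketch below uses the type of a list iterator as one such case.

```python
# list_iterator is the type of the object returned by iter([]);
# it offers no constructor at the Python level
list_iterator = type(iter([]))

try:
    list_iterator()
    message = None
except TypeError as e:
    message = str(e)

print(message)  # cannot create 'list_iterator' instances
```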
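Two parts of type_call are key:
- Calling the function that the type object's tp_new points to, to allocate memory for the instance object
- Calling the function that tp_init points to, to initialize the instance object, that is, set its attributes
These correspond to __new__ and __init__ in Python. __new__ allocates memory for the instance object and returns a pointer to that memory (the object), and that pointer is automatically passed to __init__ as self.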
class Girl:

    def __new__(cls, name, age):
        print("__new__ called")
        # The pattern is fixed:
        # object.__new__(cls) creates an instance of Girl,
        # so cls here is Girl itself. Note: it must be returned,
        # because the return value of __new__ becomes self in __init__
        return object.__new__(cls)

    def __init__(self, name, age):
        print("__init__ called")
        self.name = name
        self.age = age

g = Girl("古明地觉", 16)
print(g.name, g.age)
"""
__new__ called
__init__ called
古明地觉 16
"""
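The parameters of __new__ must be compatible with those of __init__, because __new__ runs first and the interpreter then passes its return value, together with the arguments we supplied, to __init__. For that reason the parameters of __new__ after cls are usually written as *args and **kwargs.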
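A minimal sketch of that convention: __new__ accepts anything and forwards only cls, so the signature of __init__ can change without touching __new__. The default value for age is an assumption added for the example.

```python
class Girl:
    def __new__(cls, *args, **kwargs):
        # Accept whatever __init__ accepts, but pass only cls to object.__new__
        return super().__new__(cls)

    def __init__(self, name, age=16):
        self.name = name
        self.age = age


g1 = Girl("satori")
g2 = Girl("koishi", age=15)
print(g1.name, g1.age)  # satori 16
print(g2.name, g2.age)  # koishi 15
```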
Now look again at these lines in type_call:
static PyObject *
type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    //......
    //......
    if (!PyType_IsSubtype(Py_TYPE(obj), type))
        return obj;
    //......
    //......
}
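tp_new is supposed to return an instance of the type object. Usually we do not write __new__ at all and the default one runs. Once we override it, though, we are expected to return object.__new__(cls) ourselves. What happens if we return nothing, or return something else?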
class Girl:

    def __new__(cls, *args, **kwargs):
        print("__new__ called")
        instance = object.__new__(cls)
        # Print instance to see what it actually is
        print("instance:", instance)
        print("type(instance):", type(instance))

        # The correct thing to do is return instance,
        # but we return 123 instead
        return 123

    def __init__(self, name, age):
        print("__init__ called")

g = Girl()
"""
__new__ called
instance: <__main__.Girl object at 0x000002C0F16FA1F0>
type(instance): <class '__main__.Girl'>
"""
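There is a lot to unpack here. First, __init__ requires two arguments, yet we passed none and no error was raised. The reason is that __init__ never ran, because __new__ did not return an instance of Girl.
Printing instance shows that object.__new__(cls) returns an instance of cls, and cls here is the Girl class itself. Only if we return instance does the corresponding __init__ run; otherwise the return value of __new__ is handed back directly. Let's print the created object from outside and see: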
class Girl:

    def __new__(cls, *args, **kwargs):
        return 123

    def __init__(self, name, age):
        print("__init__ called")

g = Girl()
print(g, type(g))  # 123 <class 'int'>
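It prints 123. So, once more, the difference between tp_new and tp_init, which is also the difference between __new__ and __init__:
- tp_new: allocates memory for an instance of the type object; in a Python __new__ method this is done with object.__new__(cls), whose result is then returned
- tp_init: the return value of tp_new is passed in automatically as self, and attributes are then bound to self, which initializes the instance object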
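But if tp_new returns a pointer to something that is not an instance of the corresponding type, tp_init is not run. For example, the first argument to type_call may be &PyFloat_Type while tp_new returns a PyLongObject *.
In the code above, Girl's __new__ should have returned an instance of Girl but returned an integer instead; the types do not match, so __init__ is not run.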
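The check in type_call is PyType_IsSubtype, so an instance of a subclass still counts as matching, and because type_call then takes Py_TYPE(obj), it is the subclass's __init__ that runs. A sketch with two hypothetical classes, Base and Child:

```python
class Base:
    def __new__(cls, *args, **kwargs):
        # Always hand back a Child, whichever class was called
        return object.__new__(Child)

    def __init__(self, *args, **kwargs):
        self.initialized_by = "Base.__init__"


class Child(Base):
    def __init__(self, *args, **kwargs):
        self.initialized_by = "Child.__init__"


obj = Base()
print(type(obj).__name__)    # Child
print(obj.initialized_by)    # Child.__init__
```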
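Now we can summarize. Creating an instance object through a type object proceeds as follows:
- Step 1: get the type object of the type object, in other words the metaclass, and run the function its tp_call points to, namely type_call
- Step 2: type_call calls the function that the type object's tp_new points to. A type that does not define tp_new itself normally inherits it from the parent class given by tp_base; since every new-style class inherits from object, the lookup ends at object's __new__ at the latest. The allocation then reads tp_basicsize from the type object, which records how much memory an instance needs, and requests that memory
- Step 3: after tp_new has created the object, the instance is initialized: the pointer to the allocated memory is passed to tp_init, provided that the object returned by tp_new has a matching type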
This is why Python is said to call __new__ first and __init__ second during instantiation: in the source code, tp_new is called first and tp_init afterwards.
static PyObject *
type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // Call __new__ and take its return value
    obj = type->tp_new(type, args, kwds);
    if (type->tp_init != NULL) {
        // Combine obj, the instance returned by __new__, with args and kwds
        // and pass them all to __init__;
        // obj is passed as self
        int res = type->tp_init(obj, args, kwds);
        //......
    }
    return obj;
}
So what the source code shows matches what we observe at the Python level.
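To tie the two levels together, here is a rough Python model of the flow. It is an illustration and not CPython's code: the function name simply mirrors the C function, issubclass stands in for PyType_IsSubtype, and the one-argument type(x) case is handled up front so the sketch does not depend on how type.__new__ treats a single argument. Girl and Odd are hypothetical classes.

```python
def type_call(cls, *args, **kwargs):
    """Rough Python model of type_call; an illustration, not CPython's code."""
    # type(x) with a single argument only reports the type of x
    if cls is type and len(args) == 1 and not kwargs:
        return args[0].__class__

    # tp_new: __new__ creates the object
    obj = cls.__new__(cls, *args, **kwargs)

    # If __new__ returned something that is not an instance of cls,
    # hand it back without initializing it
    if not issubclass(type(obj), cls):
        return obj

    # tp_init: __init__ is looked up on the actual type of obj
    type(obj).__init__(obj, *args, **kwargs)
    return obj


class Girl:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class Odd:
    def __new__(cls, *args, **kwargs):
        return 123  # not an instance of Odd

    def __init__(self):
        raise AssertionError("never reached")


g = type_call(Girl, "satori", 16)
print(g.name, g.age)              # satori 16
print(type_call(float, "3.14"))   # 3.14
print(type_call(Odd))             # 123
print(type_call(type, 123))       # <class 'int'>
```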
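Summary
At this point we have looked at how objects are called from both the Python level and the interpreter level. More precisely, we have verified Python-level knowledge from the interpreter's side, using the relationship between tp_new and tp_init to understand the relationship between __new__ and __init__.
Object calls are far from this simple, though. Many more details are hidden behind the scenes, and they cannot all be dug out in one go.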