Probability Theory and Mathematical Statistics, Part II
4 One-Dimensional Random Variables
4.1 Distribution Functions
4.1.1 Definition and Necessary and Sufficient Conditions
Definition:
$$F_X(x)=P\{X\leqslant x\},\quad x\in\mathbb{R}$$
Necessary and sufficient conditions:
Range: $0=F(-\infty)\leqslant F(x)\leqslant F(+\infty)=1$
Monotonically non-decreasing: $\forall x_1<x_2,\ F(x_1)\leqslant F(x_2)$
Right-continuity: $F(x)=F(x+0)$
4.1.2 Computing Probabilities
$\forall x_0\in\mathbb{R},\ P\{X=x_0\}=F(x_0)-F(x_0-0)$, which equals $0$ whenever $F$ is continuous at $x_0$.
$\forall x_1<x_2,\ P\{x_1<X\leqslant x_2\}=P\{X\leqslant x_2\}-P\{X\leqslant x_1\}=F(x_2)-F(x_1)$
4.1.3 Identifying Distribution Functions
$F(ax+b)$ with $a>0$ is still a distribution function; with $a<0$ it is not (it becomes non-increasing).
$a_1F_1(x)+a_2F_2(x)$ with $a_1+a_2=1,\ a_i\geqslant 0$ is still a distribution function.
$F_1(x)F_2(x)$ and $1-[1-F_1(x)][1-F_2(x)]$ are still distribution functions.
4.1.4 Maximum and Minimum
Maximum:
$$\begin{aligned} F_Z(z)&=P\{\max(X,Y)\leqslant z\}=P\{X\leqslant z,Y\leqslant z\} \\ &\xlongequal{X,Y\text{ independent}}P\{X\leqslant z\}P\{Y\leqslant z\}=F_X(z)F_Y(z) \\ &\xlongequal{X,Y\text{ identically distributed}}[F_X(z)]^2 \end{aligned}$$
$U=\max\{X_1,X_2,\cdots,X_n\},\ F_U(x)\xlongequal{X_i\text{ i.i.d.}}[F_X(x)]^n$
Minimum:
$$\begin{aligned} F_Z(z)&=P\{\min(X,Y)\leqslant z\}=P\{(X\leqslant z)\cup (Y\leqslant z)\} \\ &=P\{X\leqslant z\}+P\{Y\leqslant z\}-P\{X\leqslant z,Y\leqslant z\} \\ &=F_X(z)+F_Y(z)-F(z,z) \\ &\xlongequal{X,Y\text{ independent}}F_X(z)+F_Y(z)-F_X(z)F_Y(z) \end{aligned}$$
Equivalently, via the complement:
$$\begin{aligned} F_Z(z)&=P\{\min(X,Y)\leqslant z\}=1-P\{X>z,Y>z\} \\ &\xlongequal{X,Y\text{ independent}}1-P\{X>z\}P\{Y>z\} \\ &=1-[1-F_X(z)][1-F_Y(z)]\xlongequal{X,Y\text{ identically distributed}}1-[1-F_X(z)]^2 \end{aligned}$$
$V=\min\{X_1,X_2,\cdots,X_n\},\ F_V(x)\xlongequal{X_i\text{ i.i.d.}}1-[1-F_X(x)]^n$
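The closed forms $[F_X(x)]^n$ and $1-[1-F_X(x)]^n$ can be checked numerically. The following is a minimal sketch, assuming i.i.d. $U(0,1)$ variables; the helper names (`cdf_max`, `cdf_min`, `empirical_cdf`) and the choice of distribution are illustrative, not part of the notes.

```python
import random

def uniform_cdf(x):
    """CDF of U(0,1), used here only as an illustrative F_X."""
    return min(max(x, 0.0), 1.0)

def cdf_max(F, n):
    """CDF of max(X_1..X_n) for i.i.d. X_i with CDF F: [F(x)]^n."""
    return lambda x: F(x) ** n

def cdf_min(F, n):
    """CDF of min(X_1..X_n) for i.i.d. X_i with CDF F: 1 - [1 - F(x)]^n."""
    return lambda x: 1.0 - (1.0 - F(x)) ** n

def empirical_cdf(stat, n, x, trials=20000, seed=0):
    """Monte Carlo estimate of P{stat(X_1..X_n) <= x} for U(0,1) samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        if stat(sample) <= x:
            hits += 1
    return hits / trials

F_U = cdf_max(uniform_cdf, 3)
F_V = cdf_min(uniform_cdf, 3)
print(F_U(0.5), empirical_cdf(max, 3, 0.5))
print(F_V(0.5), empirical_cdf(min, 3, 0.5))
```

The simulated frequencies should agree with the formulas up to Monte Carlo error.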
4.2 Random Variables
4.2.1 Discrete Type
4.2.1.1 Probability Mass Function and Necessary and Sufficient Conditions
$$X\sim P\{X=x_i\}=p_i\ (i\in\mathbb{N}^+)\ \Leftrightarrow \left\{\begin{array}{l} \text{Non-negativity: } p_i\geqslant 0\\ \text{Normalization: } \displaystyle\sum\limits_{i=1}^{\infty}p_i=1 \end{array}\right.$$
4.2.1.2 Distribution Function
$$F(x)=P\{X\leqslant x\}=\sum_{x_i\leqslant x}P\{X=x_i\}$$
4.2.2 Continuous Type
4.2.2.1 Probability Density
Necessary and sufficient conditions:
$$X\sim f_X(x) \Leftrightarrow \left\{\begin{array}{l} \text{Non-negativity: } f(x)\geqslant 0\\ \text{Normalization: } \displaystyle\int_{-\infty}^{+\infty}f(x)\mathrm{d}x=1 \end{array}\right.$$
Properties:
$\displaystyle \forall x_1<x_2,\ P\{x_1<X\leqslant x_2\}=F(x_2)-F(x_1)=\int_{x_1}^{x_2}f(t)\mathrm{d}t$
Finitely many points do not affect the area over an interval, i.e. the probability, so
(1) $P\{X=x_0\}=F(x_0)-F(x_0-0)=0$, because $F$ is continuous at $x_0$.
(2) $P\{a<X<b\}=P\{a\leqslant X\leqslant b\}=P\{a<X\leqslant b\}=P\{a\leqslant X<b\}$
If $f(x)$ is continuous at $x$, then $F'(x)=f(x)$.
(1) $F(x)$ is necessarily continuous, while $f(x)$ need not be (an integrable function need not be continuous, but its variable-upper-limit integral is).
(2) If $F(x)$ is not continuous, then $X$ is not of continuous type and has no probability density.
For example, $\displaystyle F(x)=\left\{\begin{array}{ll} 0, & x<0\\ x/2, & 0\leqslant x<1\\ 1, & x\geqslant 1 \end{array}\right.$ gives $\displaystyle P\{X=1\}=F(1)-F(1-0)=1-\frac{1}{2}=\frac{1}{2}$, so $X$ is of mixed type.
Identifying probability densities:
$af(ax+b)$ with $a>0$ is still a density, because $\displaystyle \int_{-\infty}^{+\infty}f(ax+b)\,\mathrm{d}(ax+b)=1$.
$a_1f_1(x)+a_2f_2(x)$ with $a_1+a_2=1,\ a_i\geqslant 0$ is still a density.
$f_1(x)\cdot f_2(x)$ is not necessarily a density.
$\displaystyle f_1(x)F_2(x)+f_2(x)F_1(x)=\frac{\mathrm{d}[F_1(x)F_2(x)]}{\mathrm{d}x}$ is still a density.
4.2.2.2 Distribution Function
$$F(x)=P\{X\leqslant x\}=\int_{-\infty}^{x}f(t)\mathrm{d}t,\quad f(x)\geqslant 0;\qquad P\{X<x\}=F(x-0)$$
4.2.2.3 Corollary
$$X\sim F_X(x)\in C(\mathbb{R})\ \Rightarrow\ Y=F_X(X)\sim U(0,1)$$
Proof (written for the case where $F_X$ is strictly increasing, so that $F_X^{-1}$ exists):
$\forall\ y\in(0,1),$
$$P\{Y=F_X(X)\leqslant y\}=P\{X\leqslant F_X^{-1}(y)\}=F_X[F_X^{-1}(y)]=y$$
Hence $Y=F_X(X)\sim U(0,1)$.
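A quick simulation can make the corollary concrete. The sketch below assumes $X\sim\mathrm{Exp}(1)$ (an illustrative choice, not from the notes), applies its own CDF to each draw, and checks that the resulting values behave like $U(0,1)$; `exp_cdf` and `frac_below` are hypothetical helper names.

```python
import math
import random

def exp_cdf(x, lam=1.0):
    """CDF of the exponential distribution with rate lam (continuous on R)."""
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def frac_below(values, y):
    """Empirical estimate of P{Y <= y}."""
    return sum(v <= y for v in values) / len(values)

rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(20000)]  # draws of X
ys = [exp_cdf(x) for x in xs]                      # Y = F_X(X)

for y in (0.1, 0.5, 0.9):
    # For Y ~ U(0,1) the empirical frequency should be close to y itself.
    print(y, frac_below(ys, y))
```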
4.3 Distribution of Y=g(X)
4.3.1 Discrete Type
$$P\{X=x_i\}=p_i\ \xRightarrow{\ Y=g(X)\ }\ P\{Y=g(x_i)\}=p_i$$
If several $x_i$ are mapped to the same value of $g$, their probabilities are added.
4.3.2 Continuous Type
Method 1 (graphical):
Suppose $Y=g(X)$ takes values in $(\alpha,\beta)$. For each $y$, use the line $Y=y$ on the graph of $g$ to determine the range of $X$ on which $g(X)\leqslant y$:
- $y<\alpha$: $F_Y(y)=0$
- $y\geqslant \beta$: $F_Y(y)=1$
- $\alpha\leqslant y<\beta$: $\displaystyle F_Y(y)=P\{g(X)\leqslant y\}=P\{\varphi_1(y)\leqslant X\leqslant \varphi_2(y)\}=\int_{\varphi_1(y)}^{\varphi_2(y)}f_X(t)\mathrm{d}t$. If the integral has a closed form, compute the distribution function directly; otherwise differentiate the variable-limit integral to obtain the density.
Method 2 (formula), written for strictly increasing $g$:
$$F_Y(y)=P\{Y=g(X)\leqslant y\}=P\{X\leqslant g^{-1}(y)\}=F_X[g^{-1}(y)]=\int_{-\infty}^{g^{-1}(y)}f_X(t)\mathrm{d}t$$
$$f_Y(y)=\left\{\begin{array}{ll} \displaystyle F_Y'(y)=\left[\int_{-\infty}^{g^{-1}(y)}f_X(t)\mathrm{d}t\right]'=f_X[g^{-1}(y)]\cdot\left|[g^{-1}(y)]'\right|, & y\in I\\ 0, & \text{otherwise} \end{array}\right.$$
With the absolute value, the density formula also covers strictly decreasing $g$.
Tips: The inverse function here only illustrates the computation; in practice an inverse need not exist.
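As a worked illustration of Method 1 in a case where $g$ has no global inverse, the following sketch assumes $X\sim U(-1,1)$ and $Y=X^2$ (an example chosen here, not taken from the notes), so that $\alpha=0$, $\beta=1$, $\varphi_1(y)=-\sqrt{y}$ and $\varphi_2(y)=\sqrt{y}$.

```latex
% Assumed example: X ~ U(-1,1), so f_X(t) = 1/2 on (-1,1); Y = X^2 takes values in [0,1).
F_Y(y) =
\begin{cases}
0, & y < 0,\\[4pt]
P\{-\sqrt{y}\leqslant X\leqslant \sqrt{y}\}
  = \displaystyle\int_{-\sqrt{y}}^{\sqrt{y}} \tfrac{1}{2}\,\mathrm{d}t = \sqrt{y}, & 0\leqslant y<1,\\[4pt]
1, & y\geqslant 1,
\end{cases}
\qquad
f_Y(y) = F_Y'(y) =
\begin{cases}
\dfrac{1}{2\sqrt{y}}, & 0<y<1,\\[4pt]
0, & \text{otherwise}.
\end{cases}
```

Here the integral has a closed form, so the distribution function is computed first and the density follows by differentiation.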
4.3.3 Mixed Type
Neither a probability mass function nor a density exists; solve only for $F_Y(y)$.
5 Two-Dimensional Random Variables
5.1 Distribution Functions
5.1.1 Joint Distribution Function
Definition:
$$F(x,y)=P\{(X\leqslant x)\cap (Y\leqslant y)\}=P\{X\leqslant x,Y\leqslant y\}\quad (x,y\in\mathbb{R})$$
Range:
$$0=F(-\infty,y)=F(x,-\infty)=F(-\infty,-\infty)\leqslant F(x,y)\leqslant F(+\infty,+\infty)=1$$
Right-continuity:
$$F(x,y)=F(x+0,y)=F(x,y+0)$$
Probability of a rectangle $D=(x_1,x_2]\times(y_1,y_2]$:
$$\begin{aligned} P\{(X,Y)\in D\}&=P\{x_1<X\leqslant x_2,\ y_1<Y\leqslant y_2\} \\ &=F(x_2,y_2)-F(x_2,y_1)+F(x_1,y_1)-F(x_1,y_2) \end{aligned}$$
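The rectangle formula translates directly into code. This is a minimal sketch; the joint CDF used (two independent $U(0,1)$ variables) and the names `joint_cdf` and `rect_prob` are assumptions made for illustration.

```python
def clamp01(t):
    """Restrict t to [0, 1]; this is the U(0,1) marginal CDF."""
    return min(max(t, 0.0), 1.0)

def joint_cdf(x, y):
    """Joint CDF of two independent U(0,1) variables (illustrative choice)."""
    return clamp01(x) * clamp01(y)

def rect_prob(F, x1, x2, y1, y2):
    """P{x1 < X <= x2, y1 < Y <= y2} from a joint CDF F."""
    return F(x2, y2) - F(x2, y1) + F(x1, y1) - F(x1, y2)

p = rect_prob(joint_cdf, 0.2, 0.5, 0.1, 0.7)
print(p)  # for this F the rectangle has area 0.3 * 0.6
```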
5.1.2 Marginal Distribution Functions
$$\begin{aligned} F_X(x)&=P\{X\leqslant x\}=P\{X\leqslant x,Y<+\infty\}=F(x,+\infty) \\ F_Y(y)&=P\{Y\leqslant y\}=P\{X<+\infty,Y\leqslant y\}=F(+\infty,y) \end{aligned}$$
5.2 Probability Density
5.2.1 Discrete Type
5.2.1.1 Joint Probability Distribution
$p_{ij}=P\{X=x_i,Y=y_j\}\geqslant 0$; rows are indexed by the values of $X$, columns by the values of $Y$.
$X \backslash Y$ | $y_1$ | $y_2$ | $\cdots$ | $y_n$ | $\cdots$ | $p_{i\cdot}=\sum\limits_{j=1}^{\infty}p_{ij}$ |
---|---|---|---|---|---|---|
$x_1$ | $p_{11}$ | $p_{12}$ | $\cdots$ | $p_{1n}$ | $\cdots$ | $p_{1\cdot}$ |
$x_2$ | $p_{21}$ | $p_{22}$ | $\cdots$ | $p_{2n}$ | $\cdots$ | $p_{2\cdot}$ |
$\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ |
$x_m$ | $p_{m1}$ | $p_{m2}$ | $\cdots$ | $p_{mn}$ | $\cdots$ | $p_{m\cdot}$ |
$\vdots$ | $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ |
$p_{\cdot j}=\sum\limits_{i=1}^{\infty}p_{ij}$ | $p_{\cdot 1}$ | $p_{\cdot 2}$ | $\cdots$ | $p_{\cdot n}$ | $\cdots$ | $\sum\limits_{i=1}^{\infty}\sum\limits_{j=1}^{\infty}p_{ij}=1$ |
5.2.1.2 Conditional Probability Distribution
$$P\{X=x_i\mid Y=y_j\}=\frac{P\{X=x_i,Y=y_j\}}{P\{Y=y_j\}}=\frac{p_{ij}}{p_{\cdot j}},\quad p_{\cdot j}>0$$
$$P\{Y=y_j\mid X=x_i\}=\frac{P\{X=x_i,Y=y_j\}}{P\{X=x_i\}}=\frac{p_{ij}}{p_{i\cdot}},\quad p_{i\cdot}>0$$
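Marginal and conditional distributions of a discrete pair can be read off a joint table mechanically. The sketch below uses a small made-up joint distribution and exact fractions; the table values and function names are illustrative assumptions.

```python
from fractions import Fraction as Fr

# Hypothetical joint pmf p_ij = P{X = x_i, Y = y_j}, keyed by (x, y).
joint = {
    (0, 0): Fr(1, 8), (0, 1): Fr(3, 8),
    (1, 0): Fr(2, 8), (1, 1): Fr(2, 8),
}

def marginal_x(joint):
    """Row sums p_i. = sum over j of p_ij."""
    out = {}
    for (x, _), p in joint.items():
        out[x] = out.get(x, 0) + p
    return out

def marginal_y(joint):
    """Column sums p_.j = sum over i of p_ij."""
    out = {}
    for (_, y), p in joint.items():
        out[y] = out.get(y, 0) + p
    return out

def cond_x_given_y(joint, y):
    """P{X = x_i | Y = y} = p_ij / p_.j, defined only when p_.j > 0."""
    py = marginal_y(joint).get(y, 0)
    if py == 0:
        raise ValueError("conditioning event has probability zero")
    return {x: p / py for (x, yy), p in joint.items() if yy == y}

print(marginal_x(joint), marginal_y(joint))
print(cond_x_given_y(joint, 0))
```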
5.2.2 Continuous Type
5.2.2.1 Joint Probability Density
Definition:
Let $F$ be a joint distribution function. If there exists a non-negative integrable function $f$ such that $\displaystyle F(x,y)=\int_{-\infty}^x\int_{-\infty}^yf(u,v)\,\mathrm{d}v\,\mathrm{d}u$, then $f$ is called the joint probability density of $(X,Y)$.
Properties:
- $\displaystyle f(x,y)\geqslant 0,\quad \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f(u,v)\,\mathrm{d}u\,\mathrm{d}v=F(+\infty,+\infty)=1$
- $\displaystyle P\{(X,Y)\in G\}=\iint_{G}f(x,y)\,\mathrm{d}x\,\mathrm{d}y$
- If $f(x,y)$ is continuous at $(x,y)$, then $\displaystyle \frac{\partial^2F(x,y)}{\partial x\partial y}=f(x,y)$
5.2.2.2 Marginal Probability Density
$$F_X(x)=\int_{-\infty}^x\left[\int_{-\infty}^{+\infty}f(u,y)\mathrm{d}y\right]\mathrm{d}u\ \Rightarrow\ f_X(x)=\int_{-\infty}^{+\infty}f(x,y)\mathrm{d}y$$
$$F_Y(y)=\int_{-\infty}^y\left[\int_{-\infty}^{+\infty}f(x,v)\mathrm{d}x\right]\mathrm{d}v\ \Rightarrow\ f_Y(y)=\int_{-\infty}^{+\infty}f(x,y)\mathrm{d}x$$
5.2.2.3 Conditional Probability Density
$$\begin{aligned} F_{Y|X}(y|x)=\int_{-\infty}^y\frac{f(x,t)}{f_X(x)}\mathrm{d}t\ &\Rightarrow\ f_{Y|X}(y|x)=\frac{f(x,y)}{f_X(x)},\ f_X(x)>0 \\ F_{X|Y}(x|y)=\int_{-\infty}^x\frac{f(t,y)}{f_Y(y)}\mathrm{d}t\ &\Rightarrow\ f_{X|Y}(x|y)=\frac{f(x,y)}{f_Y(y)},\ f_Y(y)>0 \end{aligned}$$
Note: the marginal density used as the condition must be positive; exclude the points where it is zero.
5.3 Independence
$$\begin{aligned} X,Y\text{ independent} &\xLeftrightarrow{\ \mathrm{def}\ }\ P\{X\leqslant x,Y\leqslant y\}=P\{X\leqslant x\}P\{Y\leqslant y\}\quad(\forall x,y) \\ &\Leftrightarrow F(x,y)=F_X(x)F_Y(y) \\ &\xLeftrightarrow{\text{discrete}} P\{X=x_i,Y=y_j\}=P\{X=x_i\}P\{Y=y_j\}\quad(\forall i,j) \\ &\xLeftrightarrow{\text{continuous}} f(x,y)=f_X(x)f_Y(y) \\ &\xLeftrightarrow{\text{conditional density}} f_{X|Y}(x|y)=f_X(x)\quad(f_Y(y)>0) \\ &\begin{array}{l}\Rightarrow\\ \nLeftarrow\end{array}\ f(X),\ g(Y)\text{ independent} \end{aligned}$$
5.4 Functions of Random Variables
$$F_Z(z)=P\{g(X,Y)\leqslant z\}=\left\{\begin{array}{l} \displaystyle\sum\limits_{g(x_i,y_j)\leqslant z}p_{ij}\\ \displaystyle\iint\limits_{g(x,y)\leqslant z}f(x,y)\,\mathrm{d}x\,\mathrm{d}y \end{array}\right. \xlongequal{X,Y\text{ independent}}\left\{\begin{array}{l} \displaystyle\sum\limits_{g(x_i,y_j)\leqslant z}p_{i\cdot}\,p_{\cdot j}\\ \displaystyle\iint\limits_{g(x,y)\leqslant z}f_X(x)f_Y(y)\,\mathrm{d}x\,\mathrm{d}y \end{array}\right.$$
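For independent discrete variables, the sum over the region $g(x_i,y_j)\leqslant z$ can be evaluated by brute force. The sketch below assumes two fair dice (an illustrative example) and computes $F_Z(z)$ for $Z=X+Y$ and $Z=\max(X,Y)$; `cdf_of_function` is a hypothetical helper name.

```python
from fractions import Fraction as Fr
from itertools import product

# Marginal pmf of a fair die, used for both X and Y (assumed example).
die = {k: Fr(1, 6) for k in range(1, 7)}

def cdf_of_function(px, py, g, z):
    """F_Z(z) for Z = g(X, Y) with X, Y independent:
    the sum of p_i. * p_.j over all pairs with g(x_i, y_j) <= z."""
    return sum(px[x] * py[y] for x, y in product(px, py) if g(x, y) <= z)

F_sum_4 = cdf_of_function(die, die, lambda x, y: x + y, 4)
F_max_3 = cdf_of_function(die, die, max, 3)
print(F_sum_4, F_max_3)
```

The value for the maximum agrees with $[F_X(z)]^2$ from section 4.1.4, since $F_X(3)=1/2$.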
6 Numerical Characteristics
6.1 Expectation
Definition:
$\displaystyle E(X)=\sum_{i=1}^{\infty}x_ip_i$ (discrete) or $\displaystyle E(X)=\int_{-\infty}^{+\infty}xf(x)\mathrm{d}x$ (continuous), provided the series or integral converges absolutely.
Properties:
- $E(C)=C$ for a constant $C\in\mathbb{R}$
- $E(CX)=CE(X)$
- $E(X\pm Y)=E(X)\pm E(Y)$
- $X,Y$ independent $\Rightarrow\ E(XY)=E(X)E(Y)\ \Leftrightarrow\ \rho_{XY}=0$, i.e. $X,Y$ are uncorrelated
- $E(X^2)=D(X)+[E(X)]^2$; for $E(X^n)$ use the defining formula, and use the parity of the integrand to simplify
Functions of random variables:
$$E(Y)=E[g(X)]=\left\{\begin{array}{l} \displaystyle\sum\limits_{i=1}^{\infty}g(x_i)p_i\\ \displaystyle\int_{-\infty}^{+\infty}g(x)f(x)\mathrm{d}x \end{array}\right.$$
$$E(Z)=E[g(X,Y)]=\left\{\begin{array}{l} \displaystyle\sum\limits_{j=1}^{\infty}\sum\limits_{i=1}^{\infty}g(x_i,y_j)p_{ij}\\ \displaystyle\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}g(x,y)f(x,y)\,\mathrm{d}x\,\mathrm{d}y \end{array}\right.$$
6.2 Variance
Definition:
$$D(X)=E\{[X-E(X)]^2\}=E(X^2)-[E(X)]^2$$
Standardized random variable:
$$E\left(\frac{X-EX}{\sqrt{DX}}\right)=0,\quad D\left(\frac{X-EX}{\sqrt{DX}}\right)=1$$
Properties:
- $D(C)=0$
- $D(CX)=C^2D(X)$
- $D(X\pm Y)=D(X)+D(Y)\pm 2\mathrm{Cov}(X,Y)\xlongequal{\text{independent}\Rightarrow\text{uncorrelated}}D(X)+D(Y)$
- $D(X)=0\Leftrightarrow P\{X=E(X)\}=1$
6.3 Covariance
Definition:
$$\mathrm{Cov}(X,Y)=E[(X-EX)(Y-EY)]=E(XY)-E(X)E(Y)$$
Properties:
- $\mathrm{Cov}(X,Y)=\mathrm{Cov}(Y,X)$
- $\mathrm{Cov}(X,X)=D(X)$
- $D(X\pm Y)=D(X)+D(Y)\pm 2\mathrm{Cov}(X,Y)$
- $\displaystyle D\left(\sum\limits_{i=1}^{n}X_i\right)=\sum\limits_{i=1}^{n}D(X_i)+2\sum\limits_{1\leqslant i<j\leqslant n}\mathrm{Cov}(X_i,X_j)$
- $\mathrm{Cov}(aX,bY)=ab\ \mathrm{Cov}(X,Y)$
- $\mathrm{Cov}(X_1+X_2,Y)=\mathrm{Cov}(X_1,Y)+\mathrm{Cov}(X_2,Y)$
6.4 Linear Correlation Coefficient
Definition:
$$\rho_{XY}=\frac{\mathrm{Cov}(X,Y)}{\sqrt{D(X)}\sqrt{D(Y)}}$$
Properties:
- $|\rho_{XY}|\leqslant 1$
- $|\rho_{XY}|=1\Leftrightarrow \exists\ a\neq 0,\ b\in\mathbb{R},\ \mathrm{s.t.}\ P\{Y=aX+b\}=1$, with $\rho_{XY}=\left\{\begin{array}{ll} 1, & a>0\\ -1, & a<0 \end{array}\right.$
Independence versus uncorrelatedness:
- $X,Y$ independent $(F(x,y)=F_X(x)F_Y(y))$ $\begin{array}{l}\Rightarrow\\ \nLeftarrow\end{array}$ $X,Y$ uncorrelated $\Leftrightarrow \rho_{XY}=0\ \left\{\begin{array}{l}\Leftrightarrow \mathrm{Cov}(X,Y)=0\\ \Leftrightarrow E(XY)-E(X)E(Y)=0\\ \Leftrightarrow D(X\pm Y)=D(X)+D(Y)\end{array}\right.$
Tips: Uncorrelated means only that the two have no linear relationship; other relationships may exist. Independent means there is no relationship of any kind.
- $(X,Y)$ bivariate normal: $X,Y$ independent $\Leftrightarrow$ $X,Y$ uncorrelated
- $X,Y\sim B(1,p)$: $X,Y$ independent $\Leftrightarrow$ $X,Y$ uncorrelated
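That uncorrelated does not imply independent can be seen on a small exact example. The sketch below assumes $X$ uniform on $\{-1,0,1\}$ and $Y=X^2$ (a standard textbook choice, not from the notes); `expect` is a hypothetical helper.

```python
from fractions import Fraction as Fr

# X is uniform on {-1, 0, 1}; Y = X^2 is a deterministic function of X.
px = {-1: Fr(1, 3), 0: Fr(1, 3), 1: Fr(1, 3)}
joint = {(x, x * x): p for x, p in px.items()}

def expect(joint, g):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(joint, lambda x, y: x)
ey = expect(joint, lambda x, y: y)
cov = expect(joint, lambda x, y: x * y) - ex * ey

# Compare P{X=0, Y=0} with P{X=0} * P{Y=0} to test independence.
p_joint_00 = joint.get((0, 0), 0)
p_x0 = px[0]
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)

print(cov)                      # covariance is zero: uncorrelated
print(p_joint_00, p_x0 * p_y0)  # these differ: not independent
```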
6.5 Moments
$k$-th raw moment: $E(X^k)$
$k$-th central moment: $E\{[X-E(X)]^k\}$
Mixed moments of order $k+l$: $E(X^kY^l),\ E[(X-EX)^k(Y-EY)^l]$
The expectation $E(X)$ is the first raw moment, and the variance $D(X)=E\{[X-E(X)]^2\}$ is the second central moment.
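7 Statistics
7.1 Basic Concepts
Population: the set of all individuals relevant to the question under study.
Sample: a part of the population, drawn according to a fixed rule.
Which individuals are drawn is random, so each sample observation is a random variable.
A sample obtained by simple random sampling consists of mutually independent observations.
The population and each sample observation are random variables with the same distribution.
Example: To study the ages of people in a country, the population is the ages of everyone in the country and an individual is one person's age. The random variable is the age of a person drawn at random from the whole country, and a sample consists of the ages of $n$ people drawn from it.
Moment estimation estimates the population from the sample using the substitution principle: sample moments replace population moments.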
7.2 Statistics
Sample mean:
$$\bar{X}=\frac{1}{n}\sum\limits_{i=1}^{n}X_i\ \Rightarrow \left\{\begin{array}{l} E(\bar{X})=\mu\\ \displaystyle D(\bar{X})=\frac{D(X)}{n}=\frac{\sigma^2}{n} \end{array}\right.$$
Sample variance:
$$S^2=\frac{1}{n-1}\sum\limits_{i=1}^{n}(X_i-\bar{X})^2=\frac{1}{n-1}\left(\sum\limits_{i=1}^{n}X_i^2-n\bar{X}^2\right)\ \Rightarrow\ E(S^2)=D(X)=\sigma^2$$
Substitutions commonly used when computing numerical characteristics:
$\sum\limits_{i=1}^{n}X_i=n\bar{X}$
$(n-1)S^2=\sum\limits_{i=1}^{n}(X_i-\bar{X})^2$
$E(\chi^2(n))=n,\ D(\chi^2(n))=2n$
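The two expressions for the sample variance can be compared on concrete data. This is a minimal sketch with made-up observations; `sample_variance` is a hypothetical helper, and `statistics.variance` from the standard library (which uses the $n-1$ divisor) serves as a cross-check.

```python
import statistics

def sample_variance(xs):
    """S^2 via the shortcut (sum of X_i^2 - n * Xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative observations
n = len(data)
xbar = statistics.mean(data)
s2 = sample_variance(data)

print(xbar, s2, statistics.variance(data))
```

The shortcut form can lose precision in floating point when the mean is large relative to the spread, so the definitional form is usually preferred for numerical work.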
Normal population:
- $\bar{X}$ and $S^2$ are independent, i.e. $\bar{X}$ and $\sum\limits_{i=1}^n(X_i-\bar{X})^2$ are independent
$$\Rightarrow \left\{\begin{array}{l} \bar{X}\text{ and }S^2\text{ are uncorrelated, } E(\bar{X}S^2)=E(\bar{X})E(S^2)\\ f(\bar{X})\text{ and }g(S^2)\text{ are independent}\ \Rightarrow\ f(\bar{X})\text{ and }g(S^2)\text{ are uncorrelated} \end{array}\right.$$
$$\frac{(n-1)S^2}{\sigma^2}=\sum\limits_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma}\right)^2\sim \chi^2(n-1)\ \xRightarrow{D(\chi^2(n))=2n}\ D\left(\frac{(n-1)S^2}{\sigma^2}\right)=2(n-1)\ \Rightarrow\ D(S^2)=\frac{2\sigma^4}{n-1}$$
$$\frac{\dfrac{\bar{X}-\mu}{\sigma/\sqrt{n}}}{\sqrt{\dfrac{(n-1)S^2}{\sigma^2}\Big/(n-1)}}=\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1)$$
$$\bar{X}\sim N\left(\mu,\frac{\sigma^2}{n}\right)\Rightarrow \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim N(0,1)\ \xRightarrow{\mu=0}\ \frac{n\bar{X}^2}{\sigma^2}\sim \chi^2(1)\ \Rightarrow\ D\left(\frac{n\bar{X}^2}{\sigma^2}\right)=2\ \Rightarrow\ D(\bar{X}^2)=\frac{2\sigma^4}{n^2}$$
$$\frac{X_i-\mu}{\sigma}\sim N(0,1)\Rightarrow \sum\limits_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^2=\frac{1}{\sigma^2}\sum\limits_{i=1}^{n}(X_i-\mu)^2\sim \chi^2(n)\ \xRightarrow{\mu=0}\ \frac{X^2}{\sigma^2}\sim \chi^2(1)\ \Rightarrow\ D\left(\frac{X^2}{\sigma^2}\right)=2\ \Rightarrow\ D(X^2)=2\sigma^4$$
Tips: To compute the variance of the square of a quantity, standardize it, convert to a chi-square distribution, and thereby reduce the degree.
8 Parameter Estimation
8.1 Point Estimation
8.1.1 Method of Moments
8.1.1.1 Sample Moments and Population Moments
Sample moments:
$k$-th raw moment: $\displaystyle A_k=\frac{1}{n}\sum\limits_{i=1}^{n}X_i^k$
$k$-th central moment: $\displaystyle B_k=\frac{1}{n}\sum\limits_{i=1}^{n}(X_i-\bar{X})^k$
Population moments:
$k$-th raw moment: $E(X^k)$
$k$-th central moment: $E\{[X-E(X)]^k\}$
8.1.1.2 Substitution Principle
Replace population moments by sample moments, i.e. set $A_k:=E(X^k)$ and $B_k:=E\{[X-E(X)]^k\}$, then solve the resulting equations for the parameters to be estimated, thereby estimating the population from the sample.
8.1.1.3 Setting Up the Parameter Equations
With $k$ parameters $\theta_i$ to estimate, set up $k$ equations, choosing ones that contain the parameters and are easy to solve. (Different choices of moments may give different estimates.)
8.1.1.3.1 One Parameter
First raw moment equation:
$$\frac{1}{n}\sum\limits_{i=1}^{n}X_i\ :=\ E(X)=g(\theta)$$
If $E(X)$ turns out not to contain the parameter to be estimated, use the second-moment equation instead.
Second raw moment equation:
$$\frac{1}{n}\sum\limits_{i=1}^{n}X_i^2\ :=\ E(X^2)=g(\theta)$$
8.1.1.3.2 Two Parameters
System from the first and second raw moments:
$$\left\{\begin{array}{l} \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i\ :=\ E(X)=g(\theta_1,\theta_2)\\ \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i^2\ :=\ E(X^2)=h(\theta_1,\theta_2) \end{array}\right.$$
System from the first raw moment and the second central moment:
$$\left\{\begin{array}{l} \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i\ :=\ E(X)=g(\theta_1,\theta_2)\\ \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}(X_i-\bar{X})^2\ :=\ E\{[X-EX]^2\}=h(\theta_1,\theta_2) \end{array}\right.$$
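As an illustration of the two-parameter system built from the first raw moment and the second central moment, the sketch below assumes a $U(a,b)$ population (chosen here as an example), for which $E(X)=(a+b)/2$ and $D(X)=(b-a)^2/12$. Solving $A_1=(a+b)/2$ and $B_2=(b-a)^2/12$ gives $\hat a=A_1-\sqrt{3B_2}$ and $\hat b=A_1+\sqrt{3B_2}$; the function name and data are hypothetical.

```python
import math

def moments_uniform(xs):
    """Method-of-moments estimates (a_hat, b_hat) for an assumed U(a, b) population."""
    n = len(xs)
    a1 = sum(xs) / n                          # first sample raw moment A_1
    b2 = sum((x - a1) ** 2 for x in xs) / n   # second sample central moment B_2
    half_width = math.sqrt(3.0 * b2)          # (b - a) / 2 from B_2 = (b - a)^2 / 12
    return a1 - half_width, a1 + half_width

a_hat, b_hat = moments_uniform([1.0, 2.0, 3.0, 4.0, 5.0])
print(a_hat, b_hat)
```

Note that the moment estimates need not bracket every observation; that is a known limitation of the method for this family rather than a bug in the sketch.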
8.1.2 Maximum Likelihood Estimation
Step 1:
Construct the likelihood function $L(\theta)=L(x_1,\cdots,x_n;\theta_1,\cdots,\theta_k)=\prod\limits_{i=1}^{n}f(x_i;\theta_1,\cdots,\theta_k)$
Step 2:
Find the maximizer by differentiation:
$\displaystyle \frac{\partial L(\theta)}{\partial\theta_j}:=0\ \Rightarrow\ \hat{\theta}_j\ (j=1,\cdots,k)$
If differentiating directly is inconvenient, take the logarithm of the likelihood, $\displaystyle\ln L(\theta)=\sum\limits_{i=1}^n\ln f(x_i;\theta_1,\cdots,\theta_k)$, and differentiate that instead:
$\displaystyle \frac{\partial\ln L(\theta)}{\partial\theta_j}:=0\ \Rightarrow\ \hat{\theta}_j\ (j=1,2,\cdots,k)$
Step 3:
If the equations have a solution, that solution is the estimate.
If they have no solution, the likelihood is monotone in $\theta$ and the estimate is attained at the boundary of the admissible range, typically $\left\{\begin{array}{l} L\uparrow\ :\ \hat{\theta}=\min\{X_i\}\\ L\downarrow\ :\ \hat{\theta}=\max\{X_i\} \end{array}\right.$
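Both outcomes of Step 3 can be illustrated on two assumed families (neither is taken from the notes). For an exponential density $\lambda e^{-\lambda x}$, $\ln L=n\ln\lambda-\lambda\sum x_i$ has the stationary point $\hat\lambda=n/\sum x_i$. For $U(0,\theta)$, $L=\theta^{-n}$ on $\theta\geqslant\max x_i$ is decreasing, so the estimate sits at the boundary $\hat\theta=\max x_i$. Function names and data are hypothetical.

```python
import math

def exp_log_likelihood(lam, xs):
    """ln L(lam) = n ln(lam) - lam * sum(x_i) for the density lam * exp(-lam * x)."""
    return len(xs) * math.log(lam) - lam * sum(xs)

def exp_mle(xs):
    """Root of d ln L / d lam = n / lam - sum(x_i) = 0."""
    return len(xs) / sum(xs)

def uniform_mle(xs):
    """For U(0, theta) the likelihood is decreasing on theta >= max(x_i),
    so the maximizer is the boundary point max(x_i)."""
    return max(xs)

data = [0.5, 1.5, 2.0, 4.0]  # illustrative observations
lam_hat = exp_mle(data)
print(lam_hat, exp_log_likelihood(lam_hat, data))
print(uniform_mle(data))
```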
8.1.3 Criteria for Evaluating Estimators
Unbiasedness:
If $E(\hat{\theta})=\theta$, then $\hat{\theta}$ is an unbiased estimator of the unknown parameter $\theta$.
Efficiency:
If $E(\hat{\theta}_1)=E(\hat{\theta}_2)=\theta$ and $D(\hat{\theta}_1)<D(\hat{\theta}_2)$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$.
Consistency:
If $\hat{\theta}_n=\hat{\theta}(X_1,\cdots,X_n)$ converges in probability to $\theta$, i.e. $\displaystyle\lim_{n\to\infty}P\{|\hat{\theta}_n-\theta|<\varepsilon\}=1$ for every $\varepsilon>0$, then $\hat{\theta}_n$ is a consistent estimator.
Tips: Check this with Khinchin's law of large numbers or Chebyshev's inequality.
8.1.4 Laws of Large Numbers and the Central Limit Theorem
Chebyshev's inequality:
$$\left\{\begin{array}{l} \exists\ E(X)=\mu\\ \exists\ D(X)=\sigma^2 \end{array}\right. \xRightarrow{\ \forall \varepsilon>0\ } \left\{\begin{array}{l} \displaystyle P\{|X-E(X)|\geqslant \varepsilon\}\leqslant \frac{D(X)}{\varepsilon^2}\\ \displaystyle P\{|X-E(X)|<\varepsilon\}\geqslant 1-\frac{D(X)}{\varepsilon^2} \end{array}\right.$$
Khinchin's law of large numbers: the sample mean converges in probability to the expectation.
$$\left\{\begin{array}{l} X_i\ \text{i.i.d.}\\ \exists\ E(X_i)=\mu \end{array}\right. \xRightarrow{\ \forall \varepsilon>0\ } \lim_{n\to\infty}P\left\{\left|\frac{1}{n}\sum_{i=1}^nX_i-\mu\right|<\varepsilon\right\}=1$$
Lindeberg-Lévy central limit theorem: the standardized sample mean converges in distribution to the standard normal.
$$\left\{\begin{array}{l} X_i\ \text{i.i.d.}\\ \exists\ E(X_i)=\mu\\ \exists\ D(X_i)=\sigma^2>0 \end{array}\right. \xRightarrow{\ \forall x\in\mathbb{R}\ } \lim_{n\to\infty}P\left\{\frac{\frac{1}{n}\sum_{i=1}^nX_i-\mu}{\sigma/\sqrt{n}}\leqslant x\right\}=\Phi(x)$$
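Chebyshev's inequality can be verified exactly on a small discrete distribution. The sketch below assumes a fair six-sided die (an illustrative choice) and compares the true tail probability with the bound $D(X)/\varepsilon^2$; the helper names are hypothetical.

```python
from fractions import Fraction as Fr

# pmf of a fair die (assumed example).
pmf = {k: Fr(1, 6) for k in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

def tail_prob(eps):
    """Exact P{|X - E(X)| >= eps}."""
    return sum(p for x, p in pmf.items() if abs(x - mean) >= eps)

def chebyshev_bound(eps):
    """Upper bound D(X) / eps^2 from Chebyshev's inequality."""
    return var / Fr(eps) ** 2

print(mean, var)
print(tail_prob(2), chebyshev_bound(2))
```

The bound holds but is far from tight here, which is typical: Chebyshev's inequality uses only the mean and variance.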
8.2 Interval Estimation
8.2.1 Confidence Intervals
Let $\theta$ be a parameter of the population with parameter space $\Theta$, and let $x_k\ (k=1,\cdots,n)$ be a sample from the population. For a given $\alpha\ (0<\alpha<1)$, suppose there are two statistics $\hat{\theta}_L=\hat{\theta}_L(x_1,\cdots,x_n)$ and $\hat{\theta}_U=\hat{\theta}_U(x_1,\cdots,x_n)$ such that for every $\theta\in\Theta$,
$$P_{\theta}\left(\hat{\theta}_L\leqslant \theta \leqslant\hat{\theta}_U\right)=1-\alpha$$
Then $[\hat{\theta}_L,\hat{\theta}_U]$ is called an exact confidence interval for $\theta$ with confidence level $1-\alpha$ (coverage equal to $1-\alpha$ for every $\theta$), and $\alpha$ is called the significance level.
8.2.2 Pivotal Quantity Method
Let $X\sim N(\mu,\sigma^2)$. Find an interval estimate of $\mu$ with confidence level $1-\alpha$:
(1) when $\sigma^2$ is known;
(2) when $\sigma^2$ is unknown.
Solution: (1)
$$Y=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim N(0,1)$$
$$\begin{aligned} 1-\alpha&=P\left\{-u_{\frac{\alpha}{2}}\leqslant Y=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leqslant u_{\frac{\alpha}{2}}\right\} \\ &=P\left\{\bar{X}-\frac{\sigma}{\sqrt{n}}u_{\frac{\alpha}{2}}\leqslant \mu \leqslant \bar{X}+\frac{\sigma}{\sqrt{n}}u_{\frac{\alpha}{2}}\right\} \end{aligned}$$
(2)
$$Y=\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1)$$
$$\begin{aligned} 1-\alpha&=P\left\{-t_{\frac{\alpha}{2}}(n-1)\leqslant Y=\frac{\bar{X}-\mu}{S/\sqrt{n}}\leqslant t_{\frac{\alpha}{2}}(n-1)\right\} \\ &=P\left\{\bar{X}-\frac{S}{\sqrt{n}}t_{\frac{\alpha}{2}}(n-1)\leqslant \mu \leqslant \bar{X}+\frac{S}{\sqrt{n}}t_{\frac{\alpha}{2}}(n-1)\right\} \end{aligned}$$
Here $u_{\frac{\alpha}{2}}$ and $t_{\frac{\alpha}{2}}(n-1)$ denote the upper $\alpha/2$ quantiles of $N(0,1)$ and $t(n-1)$.
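Case (1) can be computed with the Python standard library alone. The sketch below is a minimal illustration with made-up data and an assumed known $\sigma$; `z_interval` is a hypothetical helper, and `NormalDist().inv_cdf(1 - alpha/2)` supplies the upper $\alpha/2$ quantile $u_{\alpha/2}$. Case (2) has the same shape with $S$ and a $t(n-1)$ quantile, which the standard library does not provide (a table or a statistics package would be needed).

```python
import math
from statistics import NormalDist, mean

def z_interval(xs, sigma, alpha=0.05):
    """Confidence interval for mu at level 1 - alpha when sigma is known."""
    n = len(xs)
    xbar = mean(xs)
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile of N(0,1)
    half = u * sigma / math.sqrt(n)              # half-width (sigma / sqrt(n)) * u
    return xbar - half, xbar + half

lower, upper = z_interval([4.8, 5.1, 5.0, 4.9, 5.2], sigma=0.2)
print(lower, upper)
```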