Probability Theory and Mathematical Statistics, Part 2


4 One-Dimensional Random Variables

4.1 Distribution Function

4.1.1 Definition and Necessary and Sufficient Conditions

Definition:

$$F_X(x)=P\{X\leqslant x\},\quad x\in\mathbb{R}$$

Necessary and sufficient conditions:

  1. Range: $0=F(-\infty)\leqslant F(x)\leqslant F(+\infty)=1$

  2. Monotonically non-decreasing: $\forall x_1<x_2,\ F(x_1)\leqslant F(x_2)$

  3. Right-continuity: $F(x)=F(x+0)$

4.1.2 Computing Probabilities

$$\forall x_0\in\mathbb{R},\ P\{X=x_0\}=F(x_0)-F(x_0-0)\xlongequal{F\in C(U(x_0,\delta))}0$$

The last equality holds when $F$ is continuous in a neighborhood $U(x_0,\delta)$ of $x_0$.

$$\forall x_1<x_2,\ P\{x_1<X\leqslant x_2\}=P\{X\leqslant x_2\}-P\{X\leqslant x_1\}=F(x_2)-F(x_1)$$

4.1.3 Tests for Distribution Functions

  1. $F(ax+b)$ is still a distribution function when $a>0$; it is not when $a<0$.

  2. $a_1F_1(x)+a_2F_2(x)$ with $a_1+a_2=1,\ a_i\geqslant 0$ is still a distribution function.

  3. $F_1(x)F_2(x)$ and $1-[1-F_1(x)][1-F_2(x)]$ are still distribution functions.

4.1.4 Maximum and Minimum Functions

Maximum:
$$\begin{aligned} F_Z(z)&=P\{\max(X,Y)\leqslant z\}=P\{X\leqslant z,Y\leqslant z\} \\&\xlongequal{X,Y\ \text{independent}}P\{X\leqslant z\}P\{Y\leqslant z\}=F_X(z)F_Y(z) \\&\xlongequal{X,Y\ \text{identically distributed}}[F_X(z)]^2 \end{aligned}$$

$$U=\max\{X_1,X_2,\cdots,X_n\},\quad F_U(x)\xlongequal{X_i\ \text{i.i.d.}}[F_X(x)]^n$$

Minimum (via the addition formula):
$$\begin{aligned} F_Z(z)&=P\{\min(X,Y)\leqslant z\}=P\{(X\leqslant z)\cup (Y\leqslant z)\} \\&=P\{X\leqslant z\}+P\{Y\leqslant z\}-P\{X\leqslant z,Y\leqslant z\} \\&=F_X(z)+F_Y(z)-F(z,z) \\&\xlongequal{X,Y\ \text{independent}}F_X(z)+F_Y(z)-F_X(z)F_Y(z) \end{aligned}$$

Minimum (via the complement):
$$\begin{aligned} F_Z(z)&=P\{\min(X,Y)\leqslant z\}=1-P\{X>z,Y>z\} \\&\xlongequal{X,Y\ \text{independent}}1-P\{X>z\}P\{Y>z\} \\&=1-[1-F_X(z)][1-F_Y(z)]\xlongequal{X,Y\ \text{identically distributed}}1-[1-F_X(z)]^2 \end{aligned}$$

$$V=\min\{X_1,X_2,\cdots,X_n\},\quad F_V(x)\xlongequal{X_i\ \text{i.i.d.}}1-[1-F_X(x)]^n$$
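The closed forms for the largest and smallest of $n$ i.i.d. variables can be sanity-checked by simulation. The sketch below assumes $X_i\sim U(0,1)$, so that $F_X(x)=x$ on $[0,1]$; the function names, seed, and trial count are illustrative choices, not part of the original notes.

```python
import random

def empirical_cdf_of_extremes(n, x, trials=20000, seed=0):
    """Estimate P{max <= x} and P{min <= x} for n i.i.d. U(0,1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits_max = hits_min = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        hits_max += max(sample) <= x
        hits_min += min(sample) <= x
    return hits_max / trials, hits_min / trials

def exact_cdf_of_extremes(n, x):
    """Closed forms F^n and 1 - (1 - F)^n with F_X(x) = x clipped to [0, 1]."""
    F = min(max(x, 0.0), 1.0)
    return F ** n, 1 - (1 - F) ** n

est_max, est_min = empirical_cdf_of_extremes(n=3, x=0.7)
exact_max, exact_min = exact_cdf_of_extremes(n=3, x=0.7)
print("max:", est_max, "vs", exact_max)
print("min:", est_min, "vs", exact_min)
```

The simulated fractions should agree with the closed forms up to Monte Carlo noise, which shrinks as `trials` grows.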

4.2 Random Variables

4.2.1 Discrete Type

4.2.1.1 Probability Mass Function and Necessary and Sufficient Conditions

$$X\sim P\{X=x_i\}=p_i\ (i\in\mathbb{N}^+)\ \Leftrightarrow \left\{\begin{array}{l} \text{Non-negativity: } p_i\geqslant 0\\ \text{Normalization: } \displaystyle\sum\limits_{i=1}^{\infty}p_i=1 \end{array}\right.$$

4.2.1.2 Distribution Function

$$F(x)=P\{X\leqslant x\}=\sum_{x_i\leqslant x}P\{X=x_i\}$$

4.2.2 Continuous Type

4.2.2.1 Probability Density

Necessary and sufficient conditions:
$$X\sim f_X(x) \Leftrightarrow \left\{\begin{array}{l} \text{Non-negativity: } f(x) \geqslant 0\\ \text{Normalization: } \displaystyle\int_{-\infty}^{+\infty}f(x)\mathrm{d}x=1 \end{array}\right.$$

Properties:

  1. $\displaystyle \forall x_1<x_2,\ P\{x_1<X\leqslant x_2\}=F(x_2)-F(x_1)=\int_{x_1}^{x_2}f(t)\mathrm{d}t$

  2. Finitely many points do not change the area over an interval, that is, the probability, so
    (1) $P\{X=x_0\}=F(x_0)-F(x_0-0)\xlongequal{F\ \text{continuous at}\ x_0}0$
    (2) $P\{a<X<b\}=P\{a\leqslant X\leqslant b\}=P\{a<X\leqslant b\}=P\{a\leqslant X<b\}$

  3. If $f(x)$ is continuous at $x$, then $F'(x)=f(x)$.

  4. (1) $F(x)$ is necessarily continuous, while $f(x)$ need not be (the variable-upper-limit integral of an integrable function is continuous).
    (2) If $F(x)$ is discontinuous, then $X$ is not of continuous type and has no probability density.
    For example, $\displaystyle F(x)=\left\{\begin{array}{ll}0, & x<0\\ x/2, & 0\leqslant x<1\\ 1, & x\geqslant 1\end{array}\right.$ gives $P\{X=1\}=F(1)-F(1-0)=\dfrac{1}{2}$, so $X$ is of mixed type.

Tests for probability densities:

  1. For $a>0$, $af(ax+b)$ is still a density, because $\displaystyle \int_{-\infty}^{+\infty}f(ax+b)\mathrm{d}(ax+b)=1$.

  2. $a_1f_1(x)+a_2f_2(x)$ with $a_1+a_2=1,\ a_i \geqslant 0$ is still a density.

  3. $f_1(x)\cdot f_2(x)$ is not necessarily a density.

  4. $\displaystyle f_1(x)F_2(x)+f_2(x)F_1(x)=\frac{\mathrm{d}[F_1(x)F_2(x)]}{\mathrm{d}x}$ is still a density.

4.2.2.2 Distribution Function

$$F(x)=P\{X\leqslant x\}=\int_{-\infty}^{x}f(t)\mathrm{d}t,\quad f(x)\geqslant 0$$

$$P\{X<x\}=F(x-0)$$

4.2.2.3 Corollary

$$X\sim F_X(x)\in C(\mathbb{R})\ \Rightarrow\ Y=F_X(X)\sim U(0,1)$$

Proof (written for strictly increasing $F_X$, so that $F_X^{-1}$ exists):
$\forall\ y\in (0,1),$
$$P\{Y=F_X(X)\leqslant y\}=P\{X \leqslant F_X^{-1}(y)\}=F_X[F_X^{-1}(y)]=y$$

Hence $Y=F_X(X)\sim U(0,1)$.
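As a numerical illustration of this corollary, the sketch below assumes $X\sim \mathrm{Exp}(1)$ with $F_X(x)=1-e^{-x}$ and checks that $Y=F_X(X)$ behaves like $U(0,1)$, that is, $P\{Y\leqslant y\}\approx y$. The helper names, seed, and sample size are illustrative assumptions.

```python
import math
import random

def exp_cdf(x, rate=1.0):
    """CDF of Exp(rate): F(x) = 1 - exp(-rate * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def empirical_fraction(values, y):
    """Fraction of values that are <= y (an empirical CDF evaluated at y)."""
    return sum(v <= y for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the run is reproducible
ys = [exp_cdf(rng.expovariate(1.0)) for _ in range(50000)]

# For a U(0,1) variable the printed fraction should be close to y itself.
for y in (0.1, 0.25, 0.5, 0.9):
    print(y, round(empirical_fraction(ys, y), 3))
```

Read in the other direction, the same identity underlies inverse-transform sampling: applying $F_X^{-1}$ to a uniform draw produces a draw from $F_X$.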

4.3 Probability Distribution of Y=g(X)

4.3.1 Discrete Type

$$P\{X=x_i\}=p_i \ \xRightarrow{\ Y=g(X)\ }\ P\{Y=g(x_i)\}=p_i$$

If several $x_i$ are mapped to the same value by $g$, their probabilities are added.

4.3.2 Continuous Type

Method 1 (graphical):

Suppose $Y=g(X)$ takes values in $(\alpha,\beta)$. Use the line $Y=y$ to determine the range of $X$ on which $g(X)\leqslant y$:

  1. $y<\alpha$: $F_Y(y)=0$
  2. $y\geqslant \beta$: $F_Y(y)=1$
  3. $\alpha\leqslant y< \beta$: $\displaystyle F_Y(y)=P\{g(X)\leqslant y\}=P\{\varphi_1(y)\leqslant X\leqslant \varphi_2(y)\}=\int_{\varphi_1(y)}^{\varphi_2(y)}f_X(t)\mathrm{d}t$
     $\Rightarrow \left\{\begin{array}{l}\text{closed form available: compute the distribution function directly}\\ \text{no closed form: differentiate the variable-limit integral to obtain the density}\end{array}\right.$

Method 2 (formula), written for strictly increasing $g$:
$$F_Y(y)=P\{Y=g(X)\leqslant y\}=P\{X\leqslant g^{-1}(y)\}=F_X[g^{-1}(y)]=\int_{-\infty}^{g^{-1}(y)}f_X(t)\mathrm{d}t$$

$$f_Y(y)=\left\{\begin{array}{ll} \displaystyle F_Y'(y)=\left[\int_{-\infty}^{g^{-1}(y)}f_X(t)\mathrm{d}t\right]'=f_X[g^{-1}(y)]\cdot\left|[g^{-1}(y)]'\right|, & y\in I\\ 0, & \text{otherwise}\end{array}\right.$$
Tips: the inverse function here only illustrates the computation; in practice the inverse may not exist. The absolute value makes the final density formula valid for strictly decreasing $g$ as well.
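To make the formula of Method 2 concrete, here is a minimal sketch under an assumed example: $X\sim U(0,1)$ and $g(x)=x^2$, which is strictly increasing on $(0,1)$ with $g^{-1}(y)=\sqrt{y}$. The formula then gives $f_Y(y)=\frac{1}{2\sqrt{y}}$ and $F_Y(y)=\sqrt{y}$ on $(0,1)$. All function names are hypothetical.

```python
import math
import random

def density_of_transform(f_x, g_inv, g_inv_prime, y):
    """Evaluate f_Y(y) = f_X(g^{-1}(y)) * |(g^{-1})'(y)| for strictly monotone g."""
    return f_x(g_inv(y)) * abs(g_inv_prime(y))

def f_x(x):
    """Density of U(0, 1)."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def g_inv(y):
    """Inverse of g(x) = x**2 restricted to (0, 1)."""
    return math.sqrt(y)

def g_inv_prime(y):
    """Derivative of the inverse: d/dy sqrt(y) = 1 / (2 * sqrt(y))."""
    return 0.5 / math.sqrt(y)

# Formula value at y = 0.25, which equals 1 / (2 * sqrt(0.25)).
f_y_quarter = density_of_transform(f_x, g_inv, g_inv_prime, 0.25)

# Simulation check of F_Y(0.25) = sqrt(0.25).
rng = random.Random(1)
samples = [rng.random() ** 2 for _ in range(40000)]
cdf_estimate = sum(s <= 0.25 for s in samples) / len(samples)
print(f_y_quarter, cdf_estimate)
```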

4.3.3 Mixed Type

Neither a probability mass function nor a density exists; only $F_Y(y)$ is computed.

5 Two-Dimensional Random Variables

5.1 Distribution Function

5.1.1 Joint Distribution Function

Definition:
$$F(x,y)=P\{(X\leqslant x)\cap (Y\leqslant y)\}=P\{X\leqslant x,Y\leqslant y\}\quad (x,y\in\mathbb{R})$$

Range:
$$0=F(-\infty,y)=F(x,-\infty)=F(-\infty,-\infty)\leqslant F(x,y)\leqslant F(+\infty,+\infty)=1$$

Right-continuity:
$$F(x,y)=F(x+0,\ y)=F(x,\ y+0)$$

Probability of a rectangular region $D$:
$$\begin{aligned} P\{(X,Y)\in D\}&=P\{x_1<X\leqslant x_2,\ y_1<Y\leqslant y_2\} \\&=F(x_2,y_2)-F(x_2,y_1)+F(x_1,y_1)-F(x_1,y_2) \end{aligned}$$
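The four-term rectangle formula is easy to get wrong by a sign, so a small sketch may help. It assumes, purely for illustration, that $X$ and $Y$ are independent $U(0,1)$ variables, so $F(x,y)$ is the product of the clipped coordinates; the function names are hypothetical.

```python
def clip01(t):
    """Clip t to the interval [0, 1]; this is the CDF of U(0, 1)."""
    return min(max(t, 0.0), 1.0)

def joint_cdf(x, y):
    """Joint CDF of two independent U(0, 1) variables (an assumed example)."""
    return clip01(x) * clip01(y)

def rectangle_probability(F, x1, x2, y1, y2):
    """P{x1 < X <= x2, y1 < Y <= y2} computed from a joint CDF F."""
    return F(x2, y2) - F(x2, y1) + F(x1, y1) - F(x1, y2)

# The rectangle (0.2, 0.5] x (0.1, 0.4] has area 0.3 * 0.3.
p = rectangle_probability(joint_cdf, 0.2, 0.5, 0.1, 0.4)
print(p)
```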

5.1.2 Marginal Distribution Functions

$$\begin{aligned} F_X(x)&=P\{X\leqslant x\}=P\{X\leqslant x,Y<+\infty\}=F(x,+\infty) \\ F_Y(y)&=P\{Y\leqslant y\}=P\{X<+\infty,Y\leqslant y\}=F(+\infty,y) \end{aligned}$$

5.2 Probability Density

5.2.1 Discrete Type

5.2.1.1 Joint Probability Distribution

Every entry satisfies $p_{ij}\geqslant 0$. Rows are indexed by the values of $X$ and columns by the values of $Y$; the last column and last row hold the marginal probabilities.

$$\begin{array}{c|ccccc|c}
X\backslash Y & y_1 & y_2 & \cdots & y_n & \cdots & p_{i\cdot}=\sum\limits_{j=1}^{\infty}p_{ij}\\ \hline
x_1 & p_{11} & p_{12} & \cdots & p_{1n} & \cdots & p_{1\cdot}\\
x_2 & p_{21} & p_{22} & \cdots & p_{2n} & \cdots & p_{2\cdot}\\
\vdots & \vdots & \vdots & & \vdots & & \vdots\\
x_m & p_{m1} & p_{m2} & \cdots & p_{mn} & \cdots & p_{m\cdot}\\
\vdots & \vdots & \vdots & & \vdots & & \vdots\\ \hline
p_{\cdot j}=\sum\limits_{i=1}^{\infty}p_{ij} & p_{\cdot 1} & p_{\cdot 2} & \cdots & p_{\cdot n} & \cdots & \sum\limits_{i=1}^{\infty}\sum\limits_{j=1}^{\infty}p_{ij}=1
\end{array}$$

5.2.1.2 Conditional Probability Distribution

$$P\{X=x_i|Y=y_j\}=\frac{P\{X=x_i,Y=y_j\}}{P\{Y=y_j\}}=\frac{p_{ij}}{p_{\cdot j}}$$

$$P\{Y=y_j|X=x_i\}=\frac{P\{Y=y_j,X=x_i\}}{P\{X=x_i\}}=\frac{p_{ij}}{p_{i \cdot }}$$
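The marginal and conditional formulas translate directly into operations on a table. The sketch below uses a made-up $2\times 2$ joint table and exact fractions; the table values and helper names are assumptions for illustration only.

```python
from fractions import Fraction as Fr

# Hypothetical joint table: rows are values x_i of X, columns are values y_j of Y.
joint = [
    [Fr(1, 8), Fr(1, 8)],
    [Fr(1, 4), Fr(1, 2)],
]

def row_marginals(table):
    """p_{i.} = sum over j of p_{ij}."""
    return [sum(row) for row in table]

def col_marginals(table):
    """p_{.j} = sum over i of p_{ij}."""
    return [sum(col) for col in zip(*table)]

def conditional_x_given_y(table, j):
    """P{X = x_i | Y = y_j} = p_{ij} / p_{.j} for every i."""
    p_dot_j = sum(row[j] for row in table)
    if p_dot_j == 0:
        raise ValueError("conditioning event has probability zero")
    return [row[j] / p_dot_j for row in table]

def conditional_y_given_x(table, i):
    """P{Y = y_j | X = x_i} = p_{ij} / p_{i.} for every j."""
    p_i_dot = sum(table[i])
    if p_i_dot == 0:
        raise ValueError("conditioning event has probability zero")
    return [p / p_i_dot for p in table[i]]

print(row_marginals(joint), col_marginals(joint))
print(conditional_x_given_y(joint, 0), conditional_y_given_x(joint, 1))
```

Each conditional distribution sums to 1, which is a quick check that the right marginal was used as the denominator.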

5.2.2 Continuous Type

5.2.2.1 Joint Probability Density

Definition:
Let $F$ be a distribution function. If there exists a non-negative integrable function $f$ such that $\displaystyle F(x,y)=\int_{-\infty}^x\int_{-\infty}^y f(u,v)\mathrm{d}v\mathrm{d}u$, then $f$ is the joint probability density of $(X,Y)$.

Properties:

  1. $\displaystyle f(x,y)\geqslant 0,\ \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f(u,v)\mathrm{d}u\mathrm{d}v=F(+\infty,+\infty)=1$
  2. $\displaystyle P\{(X,Y)\in G\}=\iint_{G}f(x,y)\mathrm{d}x\mathrm{d}y$
  3. If $f(x,y)$ is continuous at $(x,y)$, then $\displaystyle \frac{\partial^2F(x,y)}{\partial x\partial y}=f(x,y)$

5.2.2.2 Marginal Probability Density

$$F_X(x)=\int_{-\infty}^x\left[\int_{-\infty}^{+\infty}f(u,y)\mathrm{d}y\right]\mathrm{d}u \ \Rightarrow\ f_X(x)=\int_{-\infty}^{+\infty}f(x,y)\mathrm{d}y$$

$$F_Y(y)=\int_{-\infty}^y\left[\int_{-\infty}^{+\infty}f(x,v)\mathrm{d}x\right]\mathrm{d}v \ \Rightarrow\ f_Y(y)=\int_{-\infty}^{+\infty}f(x,y)\mathrm{d}x$$

5.2.2.3 Conditional Probability Density

$$\begin{aligned} F_{Y|X}(y|x)=\int_{-\infty}^y\frac{f(x,t)}{f_X(x)}\mathrm{d}t \ &\Rightarrow\ f_{Y|X}(y|x)=\frac{f(x,y)}{f_X(x)},\ f_X(x)>0 \\ F_{X|Y}(x|y)=\int_{-\infty}^x\frac{f(t,y)}{f_Y(y)}\mathrm{d}t \ &\Rightarrow\ f_{X|Y}(x|y)=\frac{f(x,y)}{f_Y(y)},\ f_Y(y)>0 \end{aligned}$$

Note: make sure the marginal density used as the condition is positive, and exclude the points where it is zero.

5.3 Independence

$$\begin{aligned} X,Y\ \text{independent} &\xLeftrightarrow{\ \mathrm{def}\ }\ P\{X\leqslant x,Y\leqslant y\}=P\{X\leqslant x\}P\{Y\leqslant y\} \\&\Leftrightarrow F(x,y)=F_X(x)F_Y(y) \\&\xLeftrightarrow{\ \text{discrete}\ } P\{X=x_i,Y=y_j\}=P\{X=x_i\}P\{Y=y_j\} \\&\xLeftrightarrow{\ \text{continuous}\ } f(x,y)=f_X(x)f_Y(y) \\&\xLeftrightarrow{\ \text{conditional density}\ } f_{X|Y}(x|y)=f_X(x) \\&\begin{array}{l}\Rightarrow\\ \nLeftarrow\end{array}\ f(X),\ g(Y)\ \text{independent} \end{aligned}$$

5.4 Functions of Random Variables

$$F_Z(z)=P\{g(X,Y)\leqslant z\}=\left\{\begin{array}{l}\displaystyle \sum\limits_{g(x_i,y_j)\leqslant z}p_{ij}\\ \\ \displaystyle \iint\limits_{g(x,y)\leqslant z}f(x,y)\mathrm{d}x\mathrm{d}y \end{array}\right. \xlongequal{X,Y\ \text{independent}}\left\{\begin{array}{l} \displaystyle \sum\limits_{g(x_i,y_j)\leqslant z}p_{i\cdot}\,p_{\cdot j}\\ \\ \displaystyle \iint\limits_{g(x,y)\leqslant z}f_X(x)f_Y(y)\mathrm{d}x\mathrm{d}y \end{array}\right.$$

In each brace the first line is the discrete case and the second the continuous case.

6 Numerical Characteristics

6.1 Expectation

Definition:
$$E(X)=\left\{\begin{array}{ll}\displaystyle \sum_{i=1}^{\infty}x_ip_i, & \text{discrete}\\ \displaystyle \int_{-\infty}^{+\infty}xf(x)\mathrm{d}x, & \text{continuous}\end{array}\right.$$
provided the series or integral converges absolutely.

Properties:

  1. $E(C)=C,\ C\in\mathbb{R}$
  2. $E(CX)=CE(X)$
  3. $E(X\pm Y)=E(X)\pm E(Y)$
  4. $X,Y$ independent $\Rightarrow\ E(XY)=E(X)E(Y) \Leftrightarrow \rho_{XY}=0$, that is, $X,Y$ are uncorrelated
  5. $E(X^2)=D(X)+[E(X)]^2$; for $E(X^n)$ use the defining formula, and exploit odd/even symmetry to simplify

Functions of random variables:
$$E(Y)=E[g(X)]=\left\{\begin{array}{l} \displaystyle \sum\limits_{i=1}^{\infty}g(x_i)p_i\\ \\ \displaystyle \int_{-\infty}^{+\infty}g(x)f(x)\mathrm{d}x \end{array}\right.$$

$$E(Z)=E[g(X,Y)]=\left\{\begin{array}{l} \displaystyle \sum\limits_{j=1}^{\infty}\sum\limits_{i=1}^{\infty}g(x_i,y_j)p_{ij}\\ \\ \displaystyle \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}g(x,y)f(x,y)\mathrm{d}x\mathrm{d}y \end{array}\right.$$
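For a finite discrete distribution, the formula $E[g(X)]=\sum_i g(x_i)p_i$ is a one-line sum. The sketch below assumes a fair six-sided die as the example and uses exact fractions; `expectation` and `die` are illustrative names.

```python
from fractions import Fraction as Fr

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) * p_i for a finite pmf given as {x_i: p_i}."""
    return sum(g(x) * p for x, p in pmf.items())

die = {k: Fr(1, 6) for k in range(1, 7)}  # fair six-sided die, an assumed example

mean = expectation(die)                            # E(X)
second_moment = expectation(die, lambda x: x * x)  # E(X^2)
variance = second_moment - mean ** 2               # D(X) = E(X^2) - [E(X)]^2
print(mean, second_moment, variance)
```

Note that $E(X^2)$ is computed from the distribution of $X$ directly, without first deriving the distribution of $X^2$.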

6.2 Variance

Definition:
$$D(X)=E\{[X-E(X)]^2\}=E(X^2)-[E(X)]^2$$

Standardized random variable:
$$E\left(\frac{X-EX}{\sqrt{DX}}\right)=0,\quad D\left(\frac{X-EX}{\sqrt{DX}}\right)=1$$

Properties:

  1. $D(C)=0$
  2. $D(CX)=C^2D(X)$
  3. $D(X\pm Y)=D(X)+D(Y)\pm 2\mathrm{Cov}(X,Y)\xlongequal{\text{independent}\Rightarrow\text{uncorrelated}}D(X)+D(Y)$
  4. $D(X)=0\Leftrightarrow P\{X=E(X)\}=1$

6.3 Covariance

Definition:
$$\mathrm{Cov}(X,Y)=E[(X-EX)(Y-EY)]=E(XY)-E(X)E(Y)$$

Properties:

  1. $\mathrm{Cov}(X,Y)=\mathrm{Cov}(Y,X)$
  2. $\mathrm{Cov}(X,X)=D(X)$
    $D(X\pm Y)=D(X)+D(Y)\pm 2\mathrm{Cov}(X,Y)$
    $\displaystyle D\left(\sum\limits_{i=1}^{n}X_i\right)=\sum\limits_{i=1}^{n}D(X_i)+\sum\limits_{1 \leqslant i < j \leqslant n}2\mathrm{Cov}(X_i,X_j)$
  3. $\mathrm{Cov}(aX,bY)=ab\ \mathrm{Cov}(X,Y)$
  4. $\mathrm{Cov}(X_1+X_2,Y)=\mathrm{Cov}(X_1,Y)+\mathrm{Cov}(X_2,Y)$

6.4 Linear Correlation Coefficient

Definition:
$$\rho_{XY}=\frac{\mathrm{Cov}(X,Y)}{\sqrt{D(X)}\sqrt{D(Y)}}$$

Properties:

  1. $|\rho_{XY}|\leqslant 1$
  2. $|\rho_{XY}|= 1 \Leftrightarrow \exists\ a\neq 0,\ b\in\mathbb{R},\ \mathrm{s.t.}\ P\{Y=aX+b\}=1$, with $\rho_{XY}=\left\{\begin{array}{rl}1, & a>0\\ -1, & a<0\end{array}\right.$

Conclusions on independence and uncorrelatedness:

  1. $X,Y$ independent ($F(x,y)=F_X(x)F_Y(y)$) $\begin{array}{l}\Rightarrow\\ \nLeftarrow\end{array}$ $X,Y$ uncorrelated $\Leftrightarrow \rho_{XY}=0\ \left\{\begin{array}{l}\Leftrightarrow \mathrm{Cov}(X,Y)=0\\ \Leftrightarrow E(XY)-E(X)E(Y)=0\\ \Leftrightarrow D(X\pm Y)=D(X)+D(Y)\end{array}\right.$

Tips: uncorrelated means there is no linear relationship between the two, although other relationships may exist; independent means there is no relationship of any kind.

  2. $(X,Y)\sim N$ (bivariate normal): $X,Y$ independent $\Leftrightarrow$ $X,Y$ uncorrelated
  3. $X,Y\sim B(1,p)$: $X,Y$ independent $\Leftrightarrow$ $X,Y$ uncorrelated
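A standard counterexample shows that uncorrelated does not imply independent. The sketch below assumes $X$ uniform on $\{-1,0,1\}$ and $Y=X^2$: the covariance is zero, yet $Y$ is a function of $X$. The helper names are hypothetical.

```python
from fractions import Fraction as Fr

# X is uniform on {-1, 0, 1} and Y = X**2, stored as {(x, y): probability}.
joint = {(x, x * x): Fr(1, 3) for x in (-1, 0, 1)}

def expect(dist, g):
    """E[g(X, Y)] for a finite joint distribution."""
    return sum(g(x, y) * p for (x, y), p in dist.items())

def marginal(dist, axis):
    """Marginal pmf of X (axis=0) or Y (axis=1)."""
    out = {}
    for point, p in dist.items():
        out[point[axis]] = out.get(point[axis], 0) + p
    return out

# Cov(X, Y) = E(XY) - E(X)E(Y)
cov = (expect(joint, lambda x, y: x * y)
       - expect(joint, lambda x, y: x) * expect(joint, lambda x, y: y))

# Independence requires p_ij = p_i. * p_.j for every pair (x_i, y_j).
px, py = marginal(joint, 0), marginal(joint, 1)
independent = all(joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py)
print(cov, independent)
```

Here $P\{X=0,Y=0\}=\frac13$ while $P\{X=0\}P\{Y=0\}=\frac19$, so the factorization fails even though $\mathrm{Cov}(X,Y)=0$.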

6.5 Moments

$k$-th raw moment: $E(X^k)$
$k$-th central moment: $E\{[X-E(X)]^k\}$
Mixed moments of order $k+l$: $E(X^kY^l),\ E[(X-EX)^k(Y-EY)^l]$

The expectation $E(X)$ is the first raw moment, and the variance $D(X)=E\{[X-E(X)]^2\}$ is the second central moment.

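7 Statistics

7.1 Basic Concepts

Population: the set of all individuals relevant to the problem under study.
Sample: a subset of individuals drawn from the population according to a given rule.

Which individuals are drawn is random, so the sample observations are random variables.
A sample obtained by simple random sampling consists of mutually independent observations.

The population and each sample observation are random variables with the same distribution.

Example: to study the ages of people nationwide, the population is the ages of all people in the country and an individual is one person's age. The random variable is the age of any one person drawn from the country, and a sample is the ages of $n$ people drawn from it.
Moment estimation estimates the population from the sample by the substitution principle: population moments are replaced with sample moments.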

7.2 Statistics

Sample mean:
$$\bar{X}=\frac{1}{n}\sum\limits_{i=1}^{n}X_i \ \Rightarrow \left\{\begin{array}{l} E(\bar{X})=\mu\\ \\ \displaystyle D(\bar{X})=\frac{D(X)}{n}=\frac{\sigma^2}{n} \end{array}\right.$$

Sample variance:
$$S^2=\frac{1}{n-1}\sum\limits_{i=1}^{n}(X_i-\bar{X})^2=\frac{1}{n-1}\left(\sum\limits_{i=1}^{n}X_i^2-n\bar{X}^2\right) \Rightarrow E(S^2)=D(X)=\sigma^2$$

Substitutions commonly used when computing numerical characteristics:

$$\sum\limits_{i=1}^{n}X_i=n\bar{X}$$
$$(n-1)S^2=\sum\limits_{i=1}^{n}(X_i-\bar{X})^2$$
$$E(\chi^2(n))=n,\quad D(\chi^2(n))=2n$$
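The two expressions for the sample variance can be compared numerically. The sketch below uses made-up data and the standard library's `statistics.variance`, which also divides by $n-1$; the variable names are illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
n = len(data)

x_bar = sum(data) / n
# Definition: S^2 = sum (x_i - x_bar)^2 / (n - 1)
s2_definition = sum((x - x_bar) ** 2 for x in data) / (n - 1)
# Shortcut: S^2 = (sum x_i^2 - n * x_bar^2) / (n - 1)
s2_shortcut = (sum(x * x for x in data) - n * x_bar ** 2) / (n - 1)
# Library value, which uses the same n - 1 denominator
s2_library = statistics.variance(data)
print(x_bar, s2_definition, s2_shortcut, s2_library)
```

The shortcut form saves a pass over the data but can lose precision in floating point when the mean is large relative to the spread.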

Normal population:

  1. $\bar{X}$ and $S^2$ are independent, that is, $\bar{X}$ and $\sum\limits_{i=1}^n(X_i-\bar{X})^2$ are independent

$$\Rightarrow \left\{\begin{array}{l} \bar{X}\ \text{and}\ S^2\ \text{are uncorrelated},\ E(\bar{X}S^2)=E(\bar{X})E(S^2)\\ \\ f(\bar{X})\ \text{and}\ g(S^2)\ \text{are independent}\Rightarrow f(\bar{X})\ \text{and}\ g(S^2)\ \text{are uncorrelated} \end{array}\right.$$

  2. $\displaystyle \frac{(n-1)S^2}{\sigma^2}=\sum\limits_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma}\right)^2\sim \chi^2(n-1) \xRightarrow{D(\chi^2(n))=2n} D\left(\frac{(n-1)S^2}{\sigma^2}\right)=2(n-1)\Rightarrow D(S^2)=\frac{2\sigma^4}{n-1}$

  3. $\displaystyle \frac{\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}} {\sqrt{\frac{(n-1)S^2}{\sigma^2}/(n-1)}}=\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1)$

  4. $\displaystyle \bar{X}\sim N\left(\mu,\frac{\sigma^2}{n}\right)\Rightarrow \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim N(0,1) \xRightarrow{\mu=0} \frac{n\bar{X}^2}{\sigma^2}\sim \chi^2(1) \Rightarrow D\left(\frac{n\bar{X}^2}{\sigma^2}\right)=2 \Rightarrow D(\bar{X}^2)=\frac{2\sigma^4}{n^2}$

  5. $\displaystyle \frac{X_i-\mu}{\sigma}\sim N(0,1)\Rightarrow \sum\limits_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^2=\frac{1}{\sigma^2}\sum\limits_{i=1}^{n}(X_i-\mu)^2\sim \chi^2(n)\xRightarrow{\mu=0} \frac{X^2}{\sigma^2}\sim \chi^2(1) \Rightarrow D\left(\frac{X^2}{\sigma^2}\right)=2\Rightarrow D(X^2)=2\sigma^4$

Tips: to compute the variance of the square of a quantity, standardize it so that it reduces to a chi-square distribution, which lowers the degree.

8 Parameter Estimation

8.1 Point Estimation

8.1.1 Method of Moments

8.1.1.1 Sample Moments and Population Moments

Sample moments:
$k$-th raw moment: $\displaystyle A_k=\frac{1}{n}\sum\limits_{i=1}^{n}X_i^k$
$k$-th central moment: $\displaystyle B_k=\frac{1}{n}\sum\limits_{i=1}^{n}(X_i-\bar{X})^k$

Population moments:
$k$-th raw moment: $E(X^k)$
$k$-th central moment: $E\{[X-E(X)]^k\}$

8.1.1.2 Substitution Principle

Replace population moments with sample moments, that is, set $A_k:=E(X^k)$ and $B_k:=E\{[X-E(X)]^k\}$, then solve the resulting equations for the parameters to be estimated, thereby estimating the population from the sample.

8.1.1.3 Setting Up the Parameter Equations

For $k$ parameters $\theta_i$ to be estimated, set up $k$ equations, choosing ones that contain the parameters and are easy to solve. (Different choices of equations may give different estimates.)

8.1.1.3.1 Single-Parameter Estimation

First raw moment equation:
$$\frac{1}{n}\sum\limits_{i=1}^{n}X_i \ :=\ E(X)=g(\theta)$$

If $E(X)$ computed from the first moment does not contain the parameter to be estimated, use the second-moment equation.

Second raw moment equation:
$$\frac{1}{n}\sum\limits_{i=1}^{n}X_i^2 \ :=\ E(X^2)=g(\theta)$$

8.1.1.3.2 Two-Parameter Estimation

System from the first and second raw moments:
$$\left\{\begin{array}{l} \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i \ :=\ E(X)=g(\theta_1,\theta_2)\\ \\ \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i^2 \ :=\ E(X^2)=h(\theta_1,\theta_2) \end{array}\right.$$

System from the first raw moment and the second central moment:
$$\left\{\begin{array}{l} \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}X_i \ :=\ E(X)=g(\theta_1,\theta_2)\\ \\ \displaystyle\frac{1}{n}\sum\limits_{i=1}^{n}(X_i-\bar{X})^2 \ :=\ E\{[X-EX]^2\}=h(\theta_1,\theta_2) \end{array}\right.$$
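As a worked instance of the two-parameter system with the first raw moment and the second central moment, the sketch below assumes a population $U(a,b)$, for which $E(X)=\frac{a+b}{2}$ and $D(X)=\frac{(b-a)^2}{12}$. Solving $\bar{X}=\frac{a+b}{2}$ and $B_2=\frac{(b-a)^2}{12}$ gives $\hat a=\bar X-\sqrt{3B_2}$ and $\hat b=\bar X+\sqrt{3B_2}$. The function name and data are made up for illustration.

```python
import math

def uniform_moment_estimates(data):
    """Method-of-moments estimates (a_hat, b_hat) for U(a, b).

    Solves  x_bar = (a + b) / 2  and  B_2 = (b - a)^2 / 12.
    """
    n = len(data)
    x_bar = sum(data) / n
    b2 = sum((x - x_bar) ** 2 for x in data) / n  # second central sample moment
    half_width = math.sqrt(3.0 * b2)
    return x_bar - half_width, x_bar + half_width

a_hat, b_hat = uniform_moment_estimates([1.0, 2.0, 3.0, 4.0, 5.0])
print(a_hat, b_hat)
```

By construction the fitted $U(\hat a,\hat b)$ reproduces the sample mean and the second central sample moment exactly. The estimates can still exclude some observations, a known quirk of moment estimators for this family.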

8.1.2 Maximum Likelihood Estimation

Step 1:
Construct the likelihood function $L(\theta)=L(x_1,\cdots,x_n;\ \theta_1,\cdots,\theta_k)=\prod\limits_{i=1}^{n}f(x_i;\ \theta_1,\cdots,\theta_k)$

Step 2:
Use derivatives to find the maximum point
$$\frac{\partial L(\theta)}{\partial\theta_j} :=0\ \Rightarrow\ \hat{\theta}_j\ (j=1,\cdots,k)$$

If differentiating directly is inconvenient, take the logarithm of the likelihood, $\displaystyle\ln{L(\theta)}=\sum\limits_{i=1}^n\ln{f(x_i;\theta_1,\cdots,\theta_k)}$, and then differentiate,
$$\frac{\partial\ln L(\theta)}{\partial\theta_j} :=0\ \Rightarrow\ \hat{\theta}_j\ (j=1,2,\cdots,k)$$

Step 3:
If the equations have a solution, that solution is the estimate;
if they have no solution, the likelihood is monotone in $\theta$ and the estimate is attained at a boundary point of the admissible range, that is, $\left\{\begin{array}{l} L\uparrow\ :\ \hat{\theta}=\min\{X_i\}\\ L\downarrow\ :\ \hat{\theta}=\max\{X_i\} \end{array}\right.$
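The two outcomes of Step 3 can be illustrated with two assumed models. For the exponential density $\lambda e^{-\lambda x}$ the log-likelihood equation has the root $\hat\lambda=1/\bar{x}$. For $U(0,\theta)$ the likelihood $\theta^{-n}$ on $\theta\geqslant\max\{x_i\}$ is decreasing, so the estimate sits at the boundary. Function names and data below are hypothetical.

```python
import math

def exp_log_likelihood(rate, data):
    """ln L(rate) = n * ln(rate) - rate * sum(x_i) for the density rate * exp(-rate * x)."""
    return len(data) * math.log(rate) - rate * sum(data)

def exp_rate_mle(data):
    """Root of d ln L / d rate = n / rate - sum(x_i) = 0."""
    return len(data) / sum(data)

def uniform_theta_mle(data):
    """For U(0, theta), L = theta^(-n) on theta >= max(x_i) is decreasing,
    so the maximum is attained at the boundary point max(x_i)."""
    return max(data)

data = [0.5, 1.5, 1.0, 2.0, 1.0]  # made-up observations
rate_hat = exp_rate_mle(data)
theta_hat = uniform_theta_mle(data)
print(rate_hat, theta_hat)
```

Comparing `exp_log_likelihood` at `rate_hat` with nearby rates is a quick way to confirm that the stationary point is a maximum rather than a minimum.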

8.1.3 Criteria for Evaluating Estimators

Unbiasedness:

If $E(\hat{\theta})=\theta$, then $\hat{\theta}$ is an unbiased estimator of the unknown parameter $\theta$.

Efficiency:

If $E(\hat{\theta}_1)=E(\hat{\theta}_2)=\theta$ and $D(\hat{\theta}_1)<D(\hat{\theta}_2)$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$.

Consistency:

If $\hat{\theta}_n=\hat{\theta}(X_1,\cdots,X_n)$ converges in probability to $\theta$, that is, $\displaystyle\lim_{n\to\infty}P\{|\hat{\theta}_n-\theta|<\varepsilon\}=1$ for every $\varepsilon>0$, then $\hat{\theta}_n$ is a consistent estimator.

Tips: verify consistency with Khinchin's law of large numbers or Chebyshev's inequality.

8.1.4 Laws of Large Numbers and the Central Limit Theorem

Chebyshev's inequality:

$$\left\{\begin{array}{l} E(X)=\mu\ \text{exists}\\ \\ D(X)=\sigma^2\ \text{exists} \end{array}\right. \xRightarrow{\ \forall \varepsilon >0\ } \left\{\begin{array}{l} P\{|X-E(X)|\geqslant \varepsilon\}\leqslant \dfrac{D(X)}{\varepsilon^2}\\ \\ P\{|X-E(X)|< \varepsilon\}\geqslant 1-\dfrac{D(X)}{\varepsilon^2} \end{array}\right.$$

Khinchin's law of large numbers: the sample mean converges in probability to the expectation.

$$\left\{\begin{array}{l} X_i\ \ \text{i.i.d.}\\ E(X_i)=\mu\ \text{exists} \end{array}\right. \xRightarrow{\ \forall \varepsilon >0\ } \lim_{n\to\infty}P\left\{\left|\frac{1}{n}\sum_{i=1}^nX_i-\mu\right|<\varepsilon\right\}=1$$

Lindeberg-Lévy central limit theorem: the standardized sample mean converges in distribution to the standard normal.

$$\left\{\begin{array}{l} X_i\ \ \text{i.i.d.}\\ E(X_i)=\mu\ \text{exists}\\ D(X_i)=\sigma^2>0\ \text{exists} \end{array}\right. \xRightarrow{\ \forall x \in\mathbb{R}\ } \lim_{n\to\infty}P\left\{\frac{\frac{1}{n}\sum_{i=1}^nX_i-\mu}{\sigma/\sqrt{n}}\leqslant x\right\}=\Phi(x)$$
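The central limit theorem can be observed in a small simulation. The sketch below assumes $X_i\sim U(0,1)$, so $\mu=\frac12$ and $\sigma^2=\frac1{12}$, and compares the empirical distribution of the standardized mean of $n=30$ draws with $\Phi(x)$ from `statistics.NormalDist`. The seed, $n$, and trial count are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def standardized_mean(rng, n):
    """Standardized mean of n draws from U(0, 1), where mu = 1/2 and sigma^2 = 1/12."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
    x_bar = sum(rng.random() for _ in range(n)) / n
    return (x_bar - mu) / (sigma / math.sqrt(n))

rng = random.Random(7)  # fixed seed so the run is reproducible
zs = [standardized_mean(rng, 30) for _ in range(20000)]

# Empirical P{Z_n <= x} next to the standard normal CDF.
for x in (-1.0, 0.0, 1.0):
    empirical = sum(z <= x for z in zs) / len(zs)
    print(x, round(empirical, 3), round(NormalDist().cdf(x), 3))
```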

8.2 Interval Estimation

8.2.1 Confidence Interval

Let $\theta$ be a parameter of the population with parameter space $\Theta$, and let $x_k\ (k=1,\cdots,n)$ be a sample from the population. For a given $\alpha\ (0<\alpha<1)$, suppose there are two statistics $\hat{\theta}_L=\hat{\theta}_L(x_1,\cdots,x_n)$ and $\hat{\theta}_U=\hat{\theta}_U(x_1,\cdots,x_n)$ such that for every $\theta\in \Theta$,
$$P_{\theta}\left(\hat{\theta}_L\leqslant \theta \leqslant\hat{\theta}_U\right)=1-\alpha$$
Then $[\hat{\theta}_L,\hat{\theta}_U]$ is called a confidence interval for $\theta$ with confidence level exactly $1-\alpha$, and $\alpha$ is called the significance level.

8.2.2 Pivotal Quantity Method

Let $X\sim N(\mu,\sigma^2)$. Find an interval estimate of the parameter $\mu$ with confidence level $1-\alpha$:
(1) the confidence interval for $\mu$ when $\sigma^2$ is known;
(2) the confidence interval for $\mu$ when $\sigma^2$ is unknown.

Solution: (1)
$$Y=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\sim N(0,1)$$
$$\begin{aligned} 1-\alpha &=P\left\{-u_{\frac{\alpha}{2}}\leqslant Y=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\leqslant u_{\frac{\alpha}{2}}\right\} \\&=P\left\{\bar{X}-\frac{\sigma}{\sqrt{n}}u_{\frac{\alpha}{2}}\leqslant \mu \leqslant \bar{X}+\frac{\sigma}{\sqrt{n}}u_{\frac{\alpha}{2}}\right\} \end{aligned}$$
where $u_{\frac{\alpha}{2}}$ denotes the upper $\frac{\alpha}{2}$ quantile of $N(0,1)$.
(2)
$$Y=\frac{\bar{X}-\mu}{S/\sqrt{n}}\sim t(n-1)$$
$$\begin{aligned} 1-\alpha &=P\left\{-t_{\frac{\alpha}{2}}(n-1) \leqslant Y=\frac{\bar{X}-\mu}{S/\sqrt{n}}\leqslant t_{\frac{\alpha}{2}}(n-1)\right\} \\&=P\left\{\bar{X}-\frac{S}{\sqrt{n}}t_{\frac{\alpha}{2}}(n-1)\leqslant \mu \leqslant \bar{X}+\frac{S}{\sqrt{n}}t_{\frac{\alpha}{2}}(n-1)\right\} \end{aligned}$$
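Case (1) can be computed with the standard library alone, since `statistics.NormalDist().inv_cdf` supplies the normal quantile. The sketch below uses made-up data and an assumed known $\sigma$; `z_interval` is a hypothetical helper name. Case (2) needs a $t$ quantile, which the standard library does not provide, so it is not covered here.

```python
import math
from statistics import NormalDist, fmean

def z_interval(data, sigma, alpha=0.05):
    """Two-sided interval for mu when sigma is known:
    x_bar -/+ u_{alpha/2} * sigma / sqrt(n)."""
    n = len(data)
    x_bar = fmean(data)
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile of N(0, 1)
    half_width = u * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Made-up measurements with an assumed known standard deviation of 0.2.
lower, upper = z_interval([4.8, 5.1, 5.0, 4.9, 5.2], sigma=0.2)
print(lower, upper)
```

The interval is centered at the sample mean, and its half-width shrinks like $1/\sqrt{n}$ as more observations are added.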

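9 Hypothesis Testing


Appendix

Probability Theory and Mathematical Statistics (Part 1)


Copyright notice: this is an original article by qq_45224889, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.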