Ch3.1 Maximum likelihood and Bayesian parameter estimation

This post is based on Professor Kim Kyung-hwan's lectures at Sogang University (Fall 2020) and the pattern recognition textbook.

3.1 Maximum likelihood and Bayesian parameter estimation - Introduction

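Chapter 2 covered how to design an optimal classifier when the prior probabilities $P(\omega_i)$ and the class-conditional densities $p(\mathbf{x}|\omega_i)$ are known. In pattern recognition applications, however, we rarely have this kind of complete knowledge about the probabilistic structure of the problem.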

$P(\omega_j|\mathbf{x}) = \frac{p(\mathbf{x}|\omega_j)P(\omega_j)}{p(\mathbf{x})} = \frac{p(\mathbf{x}|\omega_j)P(\omega_j)}{\sum_{k=1}^{c} p\left(\mathbf{x} \mid \omega_{k}\right) P\left(\omega_{k}\right)}$
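To make the formula concrete, here is a minimal sketch of Bayes' rule for a two-class problem. The univariate normal densities, the equal priors, and the helper names `gaussian_pdf` and `posteriors` are assumptions chosen only for illustration; they are not from the lecture.

```python
import math

def gaussian_pdf(x, mean, std):
    """Univariate normal density N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posteriors(x, priors, likelihood_fns):
    """Return P(omega_j | x) for every class j via Bayes' rule."""
    # Numerators: p(x | omega_j) * P(omega_j)
    joint = [p * f(x) for p, f in zip(priors, likelihood_fns)]
    # Evidence: p(x) = sum_k p(x | omega_k) * P(omega_k)
    evidence = sum(joint)
    return [j / evidence for j in joint]

# Hypothetical two-class problem (values are illustrative assumptions).
priors = [0.5, 0.5]
likelihood_fns = [
    lambda x: gaussian_pdf(x, mean=0.0, std=1.0),
    lambda x: gaussian_pdf(x, mean=2.0, std=1.0),
]

# x = 1.0 lies midway between the two class means.
post = posteriors(1.0, priors, likelihood_fns)
```

Everything this function needs (the priors and the class-conditional densities) is assumed to be known here, which is exactly the assumption that this chapter drops.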

▶ An optimal classifier can be designed if we know $P(\omega_i)$ and $p(\mathbf{x}|\omega_i)$ - Ch 2

Complete knowledge about the probabilistic structure is rarely provided.

Vague and general knowledge about the situation
A limited number of design samples (training data)

That is, the subject of this chapter is finding a way to design or train a classifier under these two conditions.

▶ The problem

To find some way to use this information to design or train the classifier.

▶ An approach

To use the samples to estimate the unknown probabilities/densities, then use the resulting estimates as if they were the true values.

In other words, the estimated probabilities and densities are plugged into the Bayes classifier of Chapter 2 in place of the unknown true ones.
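This approach can be sketched as follows. The sketch assumes univariate normal class-conditional densities and synthetic training data; the names `fit_plugin` and `classify` and all numeric values are hypothetical and only illustrate the idea of treating estimates as if they were the true values.

```python
import math
import random
import statistics

def gaussian_pdf(x, mean, std):
    """Univariate normal density N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_plugin(samples_by_class):
    """Estimate (prior, mean, std) for each class from its samples."""
    n_total = sum(len(s) for s in samples_by_class)
    model = []
    for samples in samples_by_class:
        prior = len(samples) / n_total      # relative class frequency
        mean = statistics.fmean(samples)    # sample mean
        std = statistics.pstdev(samples)    # divides by n (ML estimate)
        model.append((prior, mean, std))
    return model

def classify(x, model):
    """Pick the class with the largest estimated P(omega_j) * p(x | omega_j)."""
    scores = [prior * gaussian_pdf(x, mean, std) for prior, mean, std in model]
    return max(range(len(scores)), key=scores.__getitem__)

# Synthetic training data (an assumption made for this illustration).
rng = random.Random(0)
class0 = [rng.gauss(0.0, 1.0) for _ in range(200)]
class1 = [rng.gauss(3.0, 1.0) for _ in range(100)]

model = fit_plugin([class0, class1])
label_low = classify(-1.0, model)
label_high = classify(4.0, model)
```

The classifier never sees the true parameters; it uses only the estimates stored in `model`.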

Difficulties in estimating the prior probabilities and class-conditional densities:
The number of available samples always seems too small.
The dimensionality of the feature vector $\mathbf{x}$ is large (the curse of dimensionality).
If we know the number of parameters in advance and our knowledge of the problem lets us parameterize the conditional densities, the severity of these problems can be reduced.
- If $p(\mathbf{x} | \omega_i)$ is a normal density, the problem reduces to estimating $\mu_i$ and $\Sigma_i$.

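Put differently, if the number of parameters is known in advance and our knowledge of the problem allows us to parameterize the conditional densities, the severity of these problems (too few samples, the curse of dimensionality) drops considerably. For example, if we can reasonably assume that $p(\mathbf{x}|\omega_i)$ is a normal density with mean $\mu_i$ and covariance matrix $\Sigma_i$, the task simplifies from estimating an unknown function to estimating the parameters $\mu_i$ and $\Sigma_i$.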
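As a rough illustration of why the normal assumption helps, the sketch below counts the free parameters of a $d$-dimensional normal density ($d$ for the mean plus $d(d+1)/2$ for the symmetric covariance matrix) and estimates them from samples. The function names and the three toy samples are assumptions for illustration only.

```python
def gaussian_param_count(d):
    """Free parameters of a d-dimensional normal density:
    d for the mean plus d(d+1)/2 for the symmetric covariance matrix."""
    return d + d * (d + 1) // 2

def estimate_mean_cov(samples):
    """Sample mean vector and covariance matrix (dividing by n)."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(x[k] for x in samples) / n for k in range(d)]
    cov = [
        [sum((x[a] - mean[a]) * (x[b] - mean[b]) for x in samples) / n
         for b in range(d)]
        for a in range(d)
    ]
    return mean, cov

# Toy 2-D samples (hypothetical values).
samples = [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0)]
mean, cov = estimate_mean_cov(samples)

params_2d = gaussian_param_count(2)
params_10d = gaussian_param_count(10)
```

The parameter count grows only quadratically with the dimension, so a finite set of numbers has to be estimated instead of an arbitrary density over a high-dimensional space.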

▶ Parameter Estimation (2 ways)

Maximum-likelihood estimation (MLE)

The parameters are regarded as quantities whose values are fixed but unknown.
The best estimate of their value is defined to be the one that maximizes the probability of obtaining the samples actually observed.
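A small numerical sketch of this definition: for samples assumed to come from $N(\mu, \sigma^2)$ with known $\sigma$, the value of $\mu$ that maximizes the log-likelihood over a grid should land on the sample mean. The observations, the grid range, and the name `log_likelihood` are illustrative assumptions.

```python
import math

def log_likelihood(mu, samples, sigma=1.0):
    """ln p(D | mu) for i.i.d. samples from N(mu, sigma^2)."""
    n = len(samples)
    sq = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sq / (2.0 * sigma ** 2)

# Hypothetical observations.
samples = [1.2, 0.7, 2.1, 1.6, 1.4]
sample_mean = sum(samples) / len(samples)

# Coarse grid search over candidate values of mu in [0, 3], step 0.01.
grid = [i / 100.0 for i in range(0, 301)]
mu_hat = max(grid, key=lambda mu: log_likelihood(mu, samples))
```

Here `mu_hat` is a fixed number computed from the data; nothing in the procedure treats $\mu$ as random.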

Bayesian estimation (the MAP estimate is the mode of the resulting posterior)

The parameters are regarded as random variables having some known prior distribution.
Observation of the samples converts this to a posterior density.
A typical effect of observing additional samples is to sharpen the a posteriori density function, causing it to peak near the true values of the parameters. This phenomenon is known as Bayesian learning.
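The sharpening effect can be seen in a minimal sketch for the simplest case: an unknown mean $\mu$ with a normal prior $N(\mu_0, \sigma_0^2)$ and samples from $N(\mu, \sigma^2)$ with known $\sigma^2$. In that case the posterior is again normal, with the closed-form mean and variance used below. The prior values, the true mean, and the synthetic data are assumptions for illustration.

```python
import random

def posterior_of_mean(samples, mu0, var0, var):
    """Posterior N(mu_n, var_n) of the mean, given a N(mu0, var0) prior
    and i.i.d. samples from N(mu, var) with known variance var."""
    n = len(samples)
    if n == 0:
        return mu0, var0  # no data: the posterior equals the prior
    xbar = sum(samples) / n
    denom = n * var0 + var
    mu_n = (n * var0 / denom) * xbar + (var / denom) * mu0
    var_n = var0 * var / denom
    return mu_n, var_n

# Synthetic data (hypothetical true mean 2.0, known variance 1.0).
rng = random.Random(0)
true_mu, var = 2.0, 1.0
data = [rng.gauss(true_mu, var ** 0.5) for _ in range(1000)]

# Posterior after 0, 1, 10, 100, and 1000 samples.
history = [
    posterior_of_mean(data[:n], mu0=0.0, var0=4.0, var=var)
    for n in (0, 1, 10, 100, 1000)
]
posterior_variances = [v for _, v in history]
```

The posterior variance shrinks as samples accumulate, while the posterior mean moves from the prior mean toward the sample mean.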

The next post, Ch3.2, covers "Maximum-likelihood Estimation" in earnest: the parameter estimation method that chooses the parameters maximizing the likelihood.

Reference

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., Wiley, 2001.

DeepHaejoong
