[Deep Learning] Sequence Generation Models (5): Worked Example of Evaluation Metrics: Computing the BLEU-N Score

2023-12-20 16:42:18

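Given the generated sequence "The cat sat on the mat" and the two reference sequences "The cat is on the mat" and "The bird sat on the bush", compute the BLEU-N and ROUGE-N scores for N = 1 and N = 2. This article works through the BLEU-N part.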

  • Generated sequence: $\mathbf{x}=\text{the cat sat on the mat}$
  • Reference sequences:
    • $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
    • $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$

I. BLEU-N Score (Bilingual Evaluation Understudy)


1. Definition

Let $\mathbf{x}$ be the candidate sequence generated by the model, $\mathbf{s}^{(1)}, \cdots, \mathbf{s}^{(K)}$ a set of reference sequences, and $\mathcal{W}$ the set of all N-grams extracted from the candidate sequence. The BLEU precision is defined as:

$$P_N(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}$$

where $c_w(\mathbf{x})$ is the number of times the N-gram $w$ occurs in the generated sequence $\mathbf{x}$, and $c_w(\mathbf{s}^{(k)})$ is the number of times $w$ occurs in the reference sequence $\mathbf{s}^{(k)}$. Taking the minimum clips each N-gram's count at the largest number of times it appears in any single reference.
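The precision formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `ngrams` and `modified_precision` are made up for this example, tokens are obtained by lowercasing and splitting on whitespace, and the candidate is assumed to contain at least $N$ tokens with at least one reference given.

```python
from collections import Counter
from fractions import Fraction


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision P_N of one candidate against K references."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = [Counter(ngrams(ref, n)) for ref in references]
    # Numerator: each candidate count is clipped by the largest count
    # of that n-gram in any single reference.
    clipped = sum(
        min(count, max(rc[w] for rc in ref_counts))
        for w, count in cand_counts.items()
    )
    # Denominator: total number of n-grams in the candidate.
    total = sum(cand_counts.values())
    return Fraction(clipped, total)


x = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "the bird sat on the bush".split()]
p1 = modified_precision(x, refs, 1)
p2 = modified_precision(x, refs, 2)
print(p1, p2)  # 1 4/5
```

Returning a `Fraction` keeps the result exact, which makes it easy to compare against a hand calculation.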

To penalize generated sequences that are shorter than the references, a length penalty factor $b(\mathbf{x})$ is introduced:

$$b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}$$

where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.
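As a small sketch of the length penalty exactly as defined above (the function name `brevity_penalty` and the guard against a non-positive candidate length are assumptions of this example):

```python
import math


def brevity_penalty(l_x, l_s):
    """Length penalty b(x): 1 if l_x > l_s, else exp(1 - l_s / l_x)."""
    if l_x <= 0:
        raise ValueError("candidate length must be positive")
    if l_x > l_s:
        return 1.0
    return math.exp(1 - l_s / l_x)


print(brevity_penalty(6, 6))  # equal lengths: exp(0) = 1.0
print(brevity_penalty(4, 6))  # shorter candidate: exp(-0.5), about 0.607
```

When the two lengths are equal, both branches agree, since $\exp(0)=1$.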

BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final score:

$$\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left( \sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)$$

where $N'$ is the length of the longest N-gram considered and $\alpha_N$ is the weight assigned to each N-gram length, usually set to $1/N'$.
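With uniform weights $\alpha_N = 1/N'$, the exponential of the weighted log sum is simply the geometric mean of the precisions. The sketch below shows that combination step in isolation. The function name `bleu_from_precisions` is hypothetical, the input values 0.5 and 0.25 are made-up numbers chosen only for illustration, and returning 0 when some precision is 0 is an assumption of this sketch (the definition above does not say how to handle $\log 0$).

```python
import math


def bleu_from_precisions(precisions, bp=1.0):
    """Combine P_1..P_N' with uniform weights alpha_N = 1/N'."""
    n_max = len(precisions)
    # log(0) is undefined; treating any zero precision as BLEU = 0
    # is an assumption of this sketch, not something the text specifies.
    if any(p == 0 for p in precisions):
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / n_max
    return bp * math.exp(log_avg)


score = bleu_from_precisions([0.5, 0.25])
print(score)  # sqrt(0.5 * 0.25), about 0.3536
```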

2. Calculation

N=1

  • Generated sequence: $\mathbf{x}=\text{the cat sat on the mat}$
  • Reference sequences:
    • $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
    • $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
  • $\mathcal{W}=\{\text{the, cat, sat, on, mat}\}$
    • $w=\text{the}$
      • $c_w(\mathbf{x})=2,\ c_w(\mathbf{s}^{(1)})=2,\ c_w(\mathbf{s}^{(2)})=2$
      • $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=2$
      • $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=2$
    • $w=\text{cat}$
      • $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=0$
      • $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
      • $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=1$
    • $w=\text{sat}$
      • $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=0,\ c_w(\mathbf{s}^{(2)})=1$
      • $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
      • $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=1$
    • $w=\text{on}$
      • $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=1$
      • $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
      • $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=1$
    • $w=\text{mat}$
      • $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=0$
      • $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
      • $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=1$
  • $\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=2+1+1+1+1=6$
  • $\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=2+1+1+1+1=6$
  • $P_1(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{6}{6}=1$
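The unigram counts above can be double-checked with Python's `collections.Counter`, whose `|` operator takes the element-wise maximum and whose `&` operator takes the element-wise minimum. This is only a quick verification sketch; the variable names are made up for this example, and the sequences are lowercased and split on whitespace.

```python
from collections import Counter

x = "the cat sat on the mat".split()
s1 = "the cat is on the mat".split()
s2 = "the bird sat on the bush".split()

c_x = Counter(x)
# Counter union (|) keeps the per-word maximum over the references.
c_ref_max = Counter(s1) | Counter(s2)
# Counter intersection (&) keeps the per-word minimum, i.e. the clipped count.
clipped = c_x & c_ref_max

# One line per word: w, c_w(x), max over references, clipped count.
for w in c_x:
    print(w, c_x[w], c_ref_max[w], clipped[w])

p1 = sum(clipped.values()) / sum(c_x.values())
print(p1)  # 6 / 6 = 1.0
```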

N=2

  • Generated sequence: $\mathbf{x}=\text{the cat sat on the mat}$
  • Reference sequences:
    • $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
    • $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
  • $\mathcal{W}=\{\text{the cat, cat sat, sat on, on the, the mat}\}$
| $w$ | $c_w(\mathbf{x})$ | $c_w(\mathbf{s}^{(1)})$ | $c_w(\mathbf{s}^{(2)})$ | $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})$ | $\min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)$ |
|---|---|---|---|---|---|
| the cat | 1 | 1 | 0 | 1 | 1 |
| cat sat | 1 | 0 | 0 | 0 | 0 |
| sat on | 1 | 0 | 1 | 1 | 1 |
| on the | 1 | 1 | 1 | 1 | 1 |
| the mat | 1 | 1 | 0 | 1 | 1 |
  • $\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)=1+0+1+1+1=4$
  • $\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=1+1+1+1+1=5$
  • $P_2(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{4}{5}$
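The bigram counts can be rebuilt programmatically as a sanity check. The sketch below is an illustration only, with a hypothetical `bigrams` helper; it produces one tuple per bigram in the order $w$, $c_w(\mathbf{x})$, $c_w(\mathbf{s}^{(1)})$, $c_w(\mathbf{s}^{(2)})$, maximum over references, clipped count.

```python
from collections import Counter


def bigrams(tokens):
    """Pair each token with its successor."""
    return list(zip(tokens, tokens[1:]))


x = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "the bird sat on the bush".split()]

c_x = Counter(bigrams(x))
c_refs = [Counter(bigrams(r)) for r in refs]

rows = []
for w, count in c_x.items():
    per_ref = [c[w] for c in c_refs]   # count of w in each reference
    best = max(per_ref)                # max over references
    rows.append((" ".join(w), count, *per_ref, best, min(count, best)))

for row in rows:
    print(row)

clipped_total = sum(row[-1] for row in rows)
print(clipped_total, sum(c_x.values()))  # 4 5
```

The only bigram that receives no credit is "cat sat", which appears in neither reference.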

BLEU-N Score

Recall the length penalty factor $b(\mathbf{x})$, which penalizes generated sequences that are shorter than the references:

$$b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}$$

where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.

Here $l_x=l_{s^{(1)}}=l_{s^{(2)}}=6$, so $l_s=6$ and $b(\mathbf{x}) =\exp\left(1 - \frac{l_s}{l_x}\right)=e^0=1$.

BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final score:

$$\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left( \sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)$$

where $N'$ is the length of the longest N-gram considered and $\alpha_N$ is the weight assigned to each N-gram length, usually set to $1/N'$. With $N'=2$ and $\alpha_N=\frac{1}{2}$:

$$\begin{aligned} \text{BLEU-N}(\mathbf{x}) &= 1 \times\exp\left( \sum_{N=1}^{2} \frac{1}{2} \log P_N(\mathbf{x})\right)\\ &=\exp\left(\frac{1}{2}\log P_1(\mathbf{x})+\frac{1}{2}\log P_2(\mathbf{x})\right)\\ &=\exp\left(\frac{1}{2}\log 1+\frac{1}{2}\log \frac{4}{5}\right)\\ &=\exp\left(0+\log \sqrt{\frac{4}{5}}\right)\\ &=\sqrt{\frac{4}{5}} \approx 0.894 \end{aligned}$$
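The whole calculation can be reproduced end to end with a short script. This is a sketch that follows the definitions used in this article (uniform weights, penalty based on the shortest reference); the function names `precision` and `bleu` are made up for this example, the zero-precision shortcut is an assumption, and library implementations of BLEU may use different conventions and therefore give different numbers.

```python
import math
from collections import Counter


def precision(candidate, references, n):
    """Clipped n-gram precision P_N."""
    def grams(tokens):
        # zip over n shifted copies of the list yields its n-grams
        return Counter(zip(*(tokens[i:] for i in range(n))))

    c_x = grams(candidate)
    c_ref = Counter()
    for ref in references:
        c_ref |= grams(ref)  # per-n-gram maximum over references
    return sum((c_x & c_ref).values()) / sum(c_x.values())


def bleu(candidate, references, n_max=2):
    """BLEU-N' with uniform weights and the shortest-reference penalty."""
    l_x = len(candidate)
    l_s = min(len(r) for r in references)
    bp = 1.0 if l_x > l_s else math.exp(1 - l_s / l_x)
    ps = [precision(candidate, references, n) for n in range(1, n_max + 1)]
    if min(ps) == 0:
        return 0.0  # assumption of this sketch: avoid log(0)
    return bp * math.exp(sum(math.log(p) for p in ps) / n_max)


x = "the cat sat on the mat".split()
refs = ["the cat is on the mat".split(), "the bird sat on the bush".split()]
score = bleu(x, refs)
print(round(score, 4))  # 0.8944, i.e. sqrt(4/5)
```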

Source: https://blog.csdn.net/m0_63834988/article/details/135107231