├── example.ref.zh
├── example.hyp.zh
├── example.hyp.cs
├── example.ref.en
├── example.ref.cs
├── example.hyp.en
├── example.ref.ru
├── example.hyp.ru
├── README.md
└── chrF++.py


/example.ref.zh:
--------------------------------------------------------------------------------
 1 | 市社团办相关负责人介绍，本次检查范围包括2013至2015年度在市社团办领用社会团体会费统一收据的社会团体。
 2 | 里约奥组委在２０１２年制定的奥运传播战略中就曾说：“奥运会成功与否是由哪些在社交媒体分享故事的人们定义的。”
 3 | 而从本届奥运会的报道来看，各媒体在传播方式、产品形态、语言风格、新闻视角等诸多方面都较以往做了很大的改变。
 4 | 国际奥委会和世界各国的媒体从业者，看来都已经要迎接这一轮挑战了。
 5 | 8月11日，光大证券H股公开发售进入最后一天。有消息指出，招股前两日市场反应平淡。按照计划，其将于8月18日在港交所挂牌上市。
 6 | 国内一家券商分析师表示，东方证券在港基础并不算好，旗下资管业务才是东证所长。相比中银国际、中信证券等银行系券商，显得先天不足。
 7 | 专家指出，今年以来随着证券市场回归平淡，很多券商业绩都出现了下滑，但公司上市之后情况要更多看公司本身业务的发展情况，不能一概而论。
 8 | 此项调控新政出台后，苏州将成为全国第一个重启限购的二线城市。
 9 | 出台或者即将出台楼市调控政策的二线城市不止苏州，还包括合肥、南京等城市。
10 | 此前，合肥已经率先发布限贷政策，对于在合肥名下有两套房且有一套住房贷款未结清的购房者，银行将拒绝提供房贷服务。
11 | 


--------------------------------------------------------------------------------
/example.hyp.zh:
--------------------------------------------------------------------------------
 1 | 市政协会的有关负责人说，这次检查的范围包括在2013年至2015年期间在社会群众组织中接受社会团体收费统一收据的社会团体。
 2 | “奥运会的成功是由那些在社交媒体上分享故事的人所确定的。”里约奥运会组委会在确定2012年奥运会的传播战略时表示。
 3 | 从奥运会的新闻报道来看，所有媒体在传递方式、产品形式、语言风格、新闻视角和许多其他方面都做出了很多变化。
 4 | 看来，国际奥委会和世界各地的媒体工作者都必须迎接这一轮挑战。
 5 | 8月11日，这是光大证券 H股公开发售的最后一天。消息说，前两天市场反应并不特别。根据计划，将于8月18日在香港交易所上市。
 6 | 一位国内券商分析师说，香港的东方证券没有很好的基础。只有其资本管理业务才是它的实力。与中国银行国际和中信证券等相比，它有许多内在的问题。
 7 | 专家指出，随着证券市场回归扁平化，许多券商今年的业绩出现了下滑。但上市公司的详细情况取决于公司业务的详细发展，不能一概而论。
 8 | 在新协议出台后，苏州将成为第一个重启购买限制的二级城市。
 9 | 除苏州外，包括合肥、南京在内的其他二线城市也将出台房地产市场调控政策。
10 | 在这之前，合肥已率先发布了对贷款政策的限制。对于在合肥有两个套房并有一个房屋贷款未支付的人，他们将被拒绝从银行获得抵押贷款服务。
11 | 


--------------------------------------------------------------------------------
/example.hyp.cs:
--------------------------------------------------------------------------------
 1 | Nedávné nabídky evakuace tvoří režim a Rusko znělo jako slabě zahalené výhrůžky, uvedli chirurgové pastevci a další lékaři.
 2 | Lékaři uvedli, že minulý měsíc došlo k 42 útokům na zdravotnická zařízení v Sýrii, 15 z nich byly nemocnice, ve kterých pracují.
 3 | Co nás nejvíc trápí, protože lékaři si vybírají, kdo bude žít a kdo zemře.
 4 | Malé děti jsou někdy přivedeny do našich nouzových pokojů tak těžce raněných, že musíme upřednostnit ty, kteří mají lepší šance,
 5 | Nebo prostě nemají vybavení, které by jim pomohlo, uvedli lékaři.
 6 | Před dvěma týdny čtyři novorozenci, kteří lapali po dechu, se udusili k smrti poté, co výbuch snížil přísun kyslíku do jejich inkubátorů.
 7 | Lapali po dechu, jejich životy skončily dřív, než skutečně začaly.
 8 | V posledních týdnech síly věrné Bašáru Asadovi podporované ruskými silami ovládly povstalecké východní Aleppo s přes 200 000 lidí stále uvězněných uvnitř bez dodávek potravin.
 9 | Povstalcům, kteří jsou pod neustálým útokem ruského a syrského letectva, se v sobotu podařilo prolomit obklíčení a zahájit protiofenzivu.
10 | Obě strany přidaly posily jak ve městě, tak v okolních oblastech.
11 | 


--------------------------------------------------------------------------------
/example.ref.en:
--------------------------------------------------------------------------------
 1 | Recent offers of evacuation form the regime and Russia had sounded like thinly-veiled threats, said the surgeons paediatricians and other doctors.
 2 | Doctors said that last month, there were 42 attacks on medical facilities in Syria, 15 of which were hospitals in which they work.
 3 | What pains us most, as doctors, is choosing who will live and who will die.
 4 | Young children are sometimes brought into our emergency rooms so badly injured that we have to prioritise those with better chances,
 5 | or simply do not have the equipment to help them, said the doctors.
 6 | Two weeks ago, four newborn babies, gasping for air suffocated to death after a blast cut the oxygen supply to their incubators.
 7 | Gasping for air, their lives ended before they had really begun.
 8 | In recent weeks, forces loyal to Bashar al-Assad supported by Russian forces have seized control of the rebel-held east Aleppo with over 200,000 people still trapped inside with no food supplies.
 9 | The rebels, who are under constant attack by Russian and Syrian air force, managed to break through the siege on Saturday and launch a counteroffensive.
10 | Both sides have added reinforcements both in the city and the surrounding areas.
11 | 


--------------------------------------------------------------------------------
/example.ref.cs:
--------------------------------------------------------------------------------
 1 | Nedávné nabídky evakuace obyvatel od syrského režimu a Ruska zní jako jen lehce zastřené výhrůžky, podotkli pediatři, chirurgové a další lékaři.
 2 | Lékaři uvedli, že za uplynulý měsíc bylo zaznamenáno 42 útoků na lékařská zařízení v Sýrii, z toho 15 na nemocnice, ve kterých pracují.
 3 | Nejvíce smutní jsme z toho, že musíme rozhodovat o tom, kdo bude žít a kdo zemře.
 4 | Na pohotovost k nám přicházejí malé děti s tak vážnými zraněními, že musíme mezi nimi vybírat ty, u nichž je největší pravděpodobnost, že přežijí.
 5 | A někdy nemáme ani potřebný materiál, abychom jim pomohli, popsali lékaři.
 6 | Před dvěma týdny se udusili čtyři novorozenci, když výbuch přerušil dodávky kyslíku do jejich inkubátorů.
 7 | Lapali po dechu a pak jejich život skončil - dřív, než skutečně mohl začít, připomněli.
 8 | Armáda věrná prezidentovi Bašáru Asadovi, kterou podporují ruské síly, v minulých týdnech obklíčila povstalci ovládanou východní část Halabu, uvnitř se ocitlo bez dodávek potravin přes 200.000 lidí.
 9 | Rebelům, kteří jsou pod neustálými údery ruského a syrského letectva, se v sobotu po třech týdnech podařilo obklíčení prolomit a zahájit protiofenzivu.
10 | Oběma stranám poté přibyly ve městě a jeho okolí posily.
11 | 


--------------------------------------------------------------------------------
/example.hyp.en:
--------------------------------------------------------------------------------
 1 | Recent offers of evacuating residents from the Syrian regime and Russia sound like only thinly veiled threats, pediatricians, surgeons and other doctors have said.
 2 | Doctors said there had been 42 attacks on medical facilities in Syria over the past month, from 15 to the hospitals in which they work.
 3 | The most sad is that we have to decide who lives and who dies.
 4 | There are young children coming to the ER with such serious injuries that we have to pick among them the ones most likely to survive.
 5 | And sometimes we don't even have the necessary material to help them, the doctors described.
 6 | Four newborns were suffocated two weeks ago when an explosion interrupted the supply of oxygen to their incubators.
 7 | They were gasping for breath, and then their lives were over - before they could actually begin, they recalled.
 8 | An army loyal to President Bashar al-Assad, backed by Russian forces, has besieged the rebel-controlled eastern part of Halaba in past weeks, inside with no food supplies of over 200,000 people.
 9 | The rebels, who are under constant strikes by the Russian and Syrian air force, managed to break the siege on Saturday after three weeks and launch a counter-offensive.
10 | Both sides then had reinforcements in the city and its surrounding area.
11 | 


--------------------------------------------------------------------------------
/example.ref.ru:
--------------------------------------------------------------------------------
 1 | G-Power известна своим пристрастием к автомобилям BMW, но ничто человеческое ее мастерам не чуждо.
 2 | Действительно, почему бы не усовершенствовать G 63 - 571-сильную летающую кувалду с аэродинамикой танка и разгоном до сотни за 5,4 с?
 3 | Накачивая "мускулы" сумасшедшего "гелика", мастера воспользовались преимуществами, которые открывает 5,5-литровая наддувная "восьмерка".
 4 | А вот динамика разгона до сотни улучшилась на неразличимые 0,1 с, и теперь летающий "кирпич" способен на рывок за 5,3 с.
 5 | Впрочем, ускорения со средних скоростей наверняка даются ему заметно лучше.
 6 | Потолок скорости устрашает: если стандартный G 63 быстрее 210 км/ч не ездит, то тюнинговый вседорожник с деактивированным электронным "ошейником" способен выжать 250 км/ч.
 7 | Снаружи автомобиль практически неотличим от своего заводского первоисточника, за исключением других колесных дисков - 23-дюймовых Hurricane RR размерностью 305/35 ZR 23.
 8 | Люксовое подразделение Mercedes-Maybach анонсировало поразительное концептуальное купе длиной 6 м, которое представят в рамках Конкурса элегантности в Пеббл-Бич.
 9 | Компания Mercedes-Benz собирается конкурировать с "зеленым суббрендом" BMW i и будет продвигать свои электрические модели под отдельной торговой маркой.
10 | В ближайшие годы ожидается появление минимум четырех экологичных новинок.
11 | 


--------------------------------------------------------------------------------
/example.hyp.ru:
--------------------------------------------------------------------------------
 1 | "G-Power" известна своей простотой для автомобилей BMW, но в конце концов, к своим экспертам, транспортное средство является транспортным средством.
 2 | Действительно, почему бы не улучшить G 63: 571-л.с. летающий санузел с аэродинамикой цистерны, который идет от нуля до сотни в 5,4 секунды?
 3 | Чтобы выгнать "мышцы" этого сумасшедшего "Гелика", эксперты воспользовались выгодами, которые предлагает двигатель с сверхзаряженным двигателем V8.
 4 | Ну, скорость, при которой она ускоряется от нуля до сотни, улучшенная незаметно 0,1 секунды: Теперь летающий "кирпич" способен ударить по 100 в 5,3 секунды.
 5 | Ум, ускорение при средних скоростях, наверняка, заметно легче для него.
 6 | Максимальная скорость ужасает: В то время как стандарт G 63 не может идти быстрее, чем 210 км / ч, SUV, с отключенным электронным "воротником", способен ударить по 250 км / ч.
 7 | Снаружи транспортное средство практически не отличается от заводского оригинала, за исключением различных колец колес: 23-дюймовый ураган RR с 305 / 35ZR23.
 8 | Роскошный дивизион, Mercedes-Maybach, объявил поразительный, 6-метровый концепт-купе, который будет представлен в Конкурсе d 'Elegance в Pebble Beach.
 9 | Mercedes-Benz намерен конкурировать с "зеленым суббрендом", BMW i, и будет продвигать свои электрические модели под отдельным брендом.
10 | В ближайшие годы ожидается дебют по крайней мере четырех новых экологически безопасных моделей.
11 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | # chrF
  2 |  a tool for calculating the character n-gram F-score
  3 | 
  4 | By: Maja Popovic <maja.popovic.166@gmail.com>,  June 2017
  5 | 
  6 | 
  7 | chrF++ is a tool for automatic evaluation of machine translation output based on character n-gram precision and recall enhanced with word n-grams. 
  8 | The tool calculates the F-score averaged over all character and word n-grams, where the default character n-gram order is 6 and the default word n-gram order is 2. The arithmetic mean is used for n-gram averaging.
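
As a rough illustration of the averaging described above, the sketch below computes an F-beta score for each n-gram order from its precision and recall, then takes the arithmetic mean over all character and word orders. The names `f_beta` and `mean_f` and the sample numbers are made up for this example, and the sketch assumes every order has at least one n-gram; chrF++.py itself remains the reference for the exact behaviour.

```python
def f_beta(prec, rec, beta=2.0):
    """F-beta of one n-gram order; beta > 1 weights recall more than precision."""
    factor = beta ** 2
    denom = factor * prec + rec
    if denom == 0:
        return 0.0
    return (1 + factor) * prec * rec / denom


def mean_f(char_pr, word_pr, beta=2.0):
    """Arithmetic mean of the per-order F-scores over character and word orders."""
    scores = [f_beta(p, r, beta) for p, r in char_pr + word_pr]
    return sum(scores) / len(scores)


# Hypothetical (precision, recall) pairs: two character orders, one word order.
char_pr = [(0.8, 0.6), (0.5, 0.5)]
word_pr = [(0.4, 0.2)]
print(round(100 * mean_f(char_pr, word_pr), 4))
```

With beta = 2 (the default), recall counts more than precision; with beta = 1 both contribute equally.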
  9 | 
 10 | Recent experiments have shown that adding word 1-grams and 2-grams to the standard character 6-grams improves the Pearson correlation with direct human assessments. If you want to use only character n-grams, just set the word n-gram order to 0.
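
The following sketch shows one way the character and word n-grams mentioned above could be collected, assuming characters are taken with whitespace removed and words are split on whitespace. The helper name `ngrams` is hypothetical; the shipped script additionally splits leading or trailing punctuation off words, which is omitted here.

```python
from collections import Counter


def ngrams(tokens, max_order):
    """Count all n-grams of orders 1..max_order; returns {order: Counter}."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts


sentence = "the cat sat"
chars = list(sentence.replace(" ", ""))   # characters, whitespace dropped
words = sentence.split()                  # whitespace-separated words

char_counts = ngrams(chars, 6)   # default character n-gram order
word_counts = ngrams(words, 2)   # default word n-gram order
print(len(char_counts[6]), sorted(word_counts[2]))
```

Passing a word order of 0 yields no word n-grams at all, which corresponds to the character-only setting.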
 11 | 
 12 | It is written in Python, so you have to install Python 2 or Python 3.
 13 | The option -h, --help outputs a description of the available command line options.
 14 | 
 15 | 
 16 | Required inputs:
 17 | ++++++++++++++++
 18 | 
 19 | - translation reference and hypothesis
 20 | 
 21 | The required format of all inputs is raw text containing one sentence per line. Tokenisation is not necessary. 
 22 | 
 23 | In the case of multiple references, all available reference sentences for a segment must be placed on the same line, separated by *#
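
A minimal sketch of how such a line could be split and the best-matching reference selected, assuming the highest-scoring reference is kept for each sentence (chrF++.py does this with its own F-score). The `toy_score` function is a hypothetical stand-in and not the chrF++ score.

```python
def split_references(ref_line, separator="*#"):
    """Split one line of the reference file into its alternative references."""
    return [ref.strip() for ref in ref_line.split(separator)]


def toy_score(ref, hyp):
    """Hypothetical stand-in score: fraction of reference words found in the hypothesis."""
    ref_words = ref.split()
    hyp_words = set(hyp.split())
    if not ref_words:
        return 0.0
    return sum(w in hyp_words for w in ref_words) / float(len(ref_words))


ref_line = "the cat sat on the mat *# a cat was sitting on the mat"
hyp = "a cat sat on the mat"

refs = split_references(ref_line)
best = max(refs, key=lambda ref: toy_score(ref, hyp))
print(refs)
print(best)
```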
 24 | 
 25 | 
 26 | Optional inputs:
 27 | ~~~~~~~~~~~~~~~~
 28 | 
 29 | -nc, --ncorder
 30 |   character n-gram order (default value is 6)
 31 | 
 32 | -nw, --nworder
 33 |   word n-gram order (default value is 2)
 34 | 
 35 | -b, --beta
 36 |   beta parameter to balance precision and recall (default value is 2)
 37 | 
 38 | 
 39 | Default outputs:
 40 | ++++++++++++++++
 41 | - start time
 42 | - overall document level F-score
 43 | - overall macro-averaged document level F-score (arithmetic average of the sentence level scores)
 44 | - end time
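
The two default scores can differ: the document level score pools the n-gram counts of all sentences before computing the F-score, while the macro-averaged score averages the per-sentence F-scores. A toy sketch with a single n-gram order and made-up counts (the tuples and the helper name are illustrative assumptions, not output of the tool):

```python
def f_beta(matching, ref_total, hyp_total, beta=2.0):
    """F-beta from raw n-gram counts for a single order."""
    if matching == 0 or ref_total == 0 or hyp_total == 0:
        return 0.0
    prec = matching / float(hyp_total)
    rec = matching / float(ref_total)
    factor = beta ** 2
    return (1 + factor) * prec * rec / (factor * prec + rec)


# Hypothetical per-sentence counts: (matching, reference n-grams, hypothesis n-grams).
sentences = [(9, 10, 10), (10, 40, 50)]

# Document level: pool the counts first, then compute one F-score.
pooled = [sum(column) for column in zip(*sentences)]
doc_f = f_beta(*pooled)

# Macro average: compute one F-score per sentence, then average them.
macro_f = sum(f_beta(*s) for s in sentences) / len(sentences)

print(round(100 * doc_f, 4), round(100 * macro_f, 4))
```

In this toy case the short, well-translated sentence lifts the macro average, while the pooled score is dominated by the longer sentence.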
 45 | 
 46 | Optional outputs:
 47 | ~~~~~~~~~~~~~~~~~
 48 | 
 49 | -s, --sent
 50 |     sentence level scores
 51 | 
 52 | 
 53 | Examples for testing:
 54 | -------------------------------------- 
 55 | 
 56 | You can try the tool on the given examples containing distinct languages: English (en), Czech (cs), Russian (ru) and Chinese (zh). 
 57 | For each language, example.ref.lang represents a reference and example.hyp.lang represents a hypothesis.
 58 | 
 59 | You can try various calls and compare the results:
 60 | 
 61 | 1) a simple call:
 62 | 
 63 | English:
 64 | 
 65 | chrF++.py -R example.ref.en -H example.hyp.en
 66 | 
 67 | start_time:	1497437792
 68 | c6+w2-F2	54.9482
 69 | c6+w2-avgF2	52.1829
 70 | end_time:	1497437792
 71 | 
 72 | Russian:
 73 | 
 74 | chrF++.py -R example.ref.ru -H example.hyp.ru
 75 | 
 76 | start_time:	1497437973
 77 | c6+w2-F2	42.2905
 78 | c6+w2-avgF2	42.6974
 79 | end_time:	1497437973
 80 | 
 81 | 
 82 | 
 83 | 2) changing default n-gram orders:
 84 | 
 85 | a) chrF++.py -R example.ref.en -H example.hyp.en -nc 8 -nw 1 
 86 | 
 87 | start_time:	1497438072
 88 | c8+w1-F2	52.7801
 89 | c8+w1-avgF2	49.7979
 90 | end_time:	1497438072
 91 | 
 92 | 
 93 | b) chrF++.py -R example.ref.en -H example.hyp.en -nw 0 (uses only character n-grams -- recommended for Chinese and similar languages)
 94 | 
 95 | start_time:	1497438113
 96 | c6+w0-F2	58.0911
 97 | c6+w0-avgF2	55.1081
 98 | end_time:	1497438113
 99 | 
100 | 
101 | Chinese:
102 | 
103 | chrF++.py -R example.ref.zh -H example.hyp.zh -nw 0
104 | 
105 | start_time:	1497438131
106 | c6+w0-F2	32.6986
107 | c6+w0-avgF2	33.5167
108 | end_time:	1497438131
109 | 
110 | 
111 | 
112 | 3) changing beta parameter:
113 | 
114 | a) chrF++.py -R example.ref.en -H example.hyp.en -b 1 (equal contribution of precision and recall)
115 | 
116 | start_time:	1497438189
117 | c6+w2-F1	53.9267
118 | c6+w2-avgF1	50.9922
119 | end_time:	1497438189
120 | 
121 | 
122 | b) chrF++.py -R example.ref.en -H example.hyp.en -b 0.4 (more weight on precision; the label below reads F0 because beta is printed as an integer)
123 | 
124 | start_time:	1497438211
125 | c6+w2-F0	52.7434
126 | c6+w2-avgF0	50.0280
127 | end_time:	1497438211
128 | 
129 | 
130 | 
131 | 4) sentence level scores:
132 | 
133 | chrF++.py -R example.ref.en -H example.hyp.en -s
134 | 
135 | start_time:	1497438336
136 | 1::c6+w2-F2	64.0368
137 | 2::c6+w2-F2	70.8799
138 | 3::c6+w2-F2	21.5461
139 | 4::c6+w2-F2	31.9252
140 | 5::c6+w2-F2	44.5054
141 | 6::c6+w2-F2	45.0953
142 | 7::c6+w2-F2	45.6882
143 | 8::c6+w2-F2	54.8102
144 | 9::c6+w2-F2	81.0330
145 | 10::c6+w2-F2	62.3084
146 | c6+w2-F2	54.9482
147 | c6+w2-avgF2	52.1829
148 | end_time:	1497438336
149 | 
150 | 


--------------------------------------------------------------------------------
/chrF++.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | # -*- coding: utf-8 -*-
  3 | 
  4 | # Copyright 2017 Maja Popovic
  5 | 
  6 | # The program is distributed under the terms 
  7 | # of the GNU General Public Licence (GPL)
  8 | 
  9 | # This program is distributed in the hope that it will be useful,
 10 | # but WITHOUT ANY WARRANTY; without even the implied warranty of
 11 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 12 | # GNU General Public License for more details.
 13 | 
 14 | # You should have received a copy of the GNU General Public License
 15 | # along with this program.  If not, see <http://www.gnu.org/licenses/>. 
 16 | 
 17 | 
 18 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 19 | # Publications of results obtained through the use of original or
 20 | # modified versions of the software have to cite the authors by referring
 21 | # to the following publication:
 22 | 
 23 | # Maja Popović (2015).
 24 | # "chrF: character n-gram F-score for automatic MT evaluation".
 25 | # In Proceedings of the Tenth Workshop on Statistical Machine Translation (WMT15), pages 392–395
 26 | # Lisbon, Portugal, September 2015.
 27 | 
 28 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 29 | 
 30 | import sys
 31 | import math
 32 | import unicodedata
 33 | import argparse
 34 | from collections import defaultdict
 35 | import time
 36 | import string
 37 | 
 38 | def separate_characters(line):
 39 |     return list(line.strip().replace(" ", ""))
 40 | 
 41 | def separate_punctuation(line):
 42 |     words = line.strip().split()
 43 |     tokenized = []
 44 |     for w in words:
 45 |         if len(w) == 1:
 46 |             tokenized.append(w)
 47 |         else:
 48 |             lastChar = w[-1] 
 49 |             firstChar = w[0]
 50 |             if lastChar in string.punctuation:
 51 |                 tokenized += [w[:-1], lastChar]
 52 |             elif firstChar in string.punctuation:
 53 |                 tokenized += [firstChar, w[1:]]
 54 |             else:
 55 |                 tokenized.append(w)
 56 |     
 57 |     return tokenized
 58 |     
 59 | def ngram_counts(wordList, order):
 60 |     counts = defaultdict(lambda: defaultdict(float))
 61 |     nWords = len(wordList)
 62 |     for i in range(nWords):
 63 |         for j in range(1, order+1):
 64 |             if i+j <= nWords:
 65 |                 ngram = tuple(wordList[i:i+j])
 66 |                 counts[j-1][ngram]+=1
 67 |    
 68 |     return counts
 69 | 
 70 | def ngram_matches(ref_ngrams, hyp_ngrams):
 71 |     matchingNgramCount = defaultdict(float)
 72 |     totalRefNgramCount = defaultdict(float)
 73 |     totalHypNgramCount = defaultdict(float)
 74 |  
 75 |     for order in ref_ngrams:
 76 |         for ngram in hyp_ngrams[order]:
 77 |             totalHypNgramCount[order] += hyp_ngrams[order][ngram]
 78 |         for ngram in ref_ngrams[order]:
 79 |             totalRefNgramCount[order] += ref_ngrams[order][ngram]
 80 |             if ngram in hyp_ngrams[order]:
 81 |                 matchingNgramCount[order] += min(ref_ngrams[order][ngram], hyp_ngrams[order][ngram])
 82 | 
 83 | 
 84 |     return matchingNgramCount, totalRefNgramCount, totalHypNgramCount
 85 | 
 86 | 
 87 | def ngram_precrecf(matching, reflen, hyplen, beta):
 88 |     ngramPrec = defaultdict(float)
 89 |     ngramRec = defaultdict(float)
 90 |     ngramF = defaultdict(float)
 91 |     
 92 |     factor = beta**2
 93 |     
 94 |     for order in matching:
 95 |         if hyplen[order] > 0:
 96 |             ngramPrec[order] = matching[order]/hyplen[order]
 97 |         else:
 98 |             ngramPrec[order] = 1e-16
 99 |         if reflen[order] > 0:
100 |             ngramRec[order] = matching[order]/reflen[order]
101 |         else:
102 |             ngramRec[order] = 1e-16
103 |         denom = factor*ngramPrec[order] + ngramRec[order]
104 |         if denom > 0:
105 |             ngramF[order] = (1+factor)*ngramPrec[order]*ngramRec[order] / denom
106 |         else:
107 |             ngramF[order] = 1e-16
108 |             
109 |     return ngramF, ngramRec, ngramPrec
110 | 
111 | def computeChrF(fpRef, fpHyp, nworder, ncorder, beta, sentence_level_scores = None):
112 |     norder = float(nworder + ncorder)
113 | 
114 |     # initialisation of document level scores
115 |     totalMatchingCount = defaultdict(float)
116 |     totalRefCount = defaultdict(float)
117 |     totalHypCount = defaultdict(float)
118 |     totalChrMatchingCount = defaultdict(float)
119 |     totalChrRefCount = defaultdict(float)
120 |     totalChrHypCount = defaultdict(float)
121 |     averageTotalF = 0.0
122 | 
123 |     nsent = 0
124 |     for hline, rline in zip(fpHyp, fpRef):
125 |         nsent += 1
126 |         
127 |         # preparation for multiple references
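128 |         # start below zero so the first reference is always selected, even when sentF == 0
129 |         # (otherwise the best* counts below could be undefined or stale from the previous sentence)
130 |         maxF = -1.0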
131 |         
132 |         hypNgramCounts = ngram_counts(separate_punctuation(hline), nworder)
133 |         hypChrNgramCounts = ngram_counts(separate_characters(hline), ncorder)
134 | 
135 |         # going through multiple references
136 | 
137 |         refs = rline.split("*#")
138 | 
139 |         for ref in refs:
140 |             refNgramCounts = ngram_counts(separate_punctuation(ref), nworder)
141 |             refChrNgramCounts = ngram_counts(separate_characters(ref), ncorder)
142 | 
143 |             # number of overlapping n-grams, total number of ref n-grams, total number of hyp n-grams
144 |             matchingNgramCounts, totalRefNgramCount, totalHypNgramCount = ngram_matches(refNgramCounts, hypNgramCounts)
145 |             matchingChrNgramCounts, totalChrRefNgramCount, totalChrHypNgramCount = ngram_matches(refChrNgramCounts, hypChrNgramCounts)
146 |                     
147 |             # n-gram f-scores, recalls and precisions
148 |             ngramF, ngramRec, ngramPrec = ngram_precrecf(matchingNgramCounts, totalRefNgramCount, totalHypNgramCount, beta)
149 |             chrNgramF, chrNgramRec, chrNgramPrec = ngram_precrecf(matchingChrNgramCounts, totalChrRefNgramCount, totalChrHypNgramCount, beta)
150 | 
151 |             sentRec  = (sum(chrNgramRec.values())  + sum(ngramRec.values()))  / norder
152 |             sentPrec = (sum(chrNgramPrec.values()) + sum(ngramPrec.values())) / norder
153 |             sentF    = (sum(chrNgramF.values())    + sum(ngramF.values()))    / norder
154 | 
155 |             if sentF > maxF:
156 |                 maxF = sentF
157 |                 bestMatchingCount = matchingNgramCounts
158 |                 bestRefCount = totalRefNgramCount
159 |                 bestHypCount = totalHypNgramCount
160 |                 bestChrMatchingCount = matchingChrNgramCounts
161 |                 bestChrRefCount = totalChrRefNgramCount
162 |                 bestChrHypCount = totalChrHypNgramCount
163 |         # all the references are done
164 | 
165 | 
166 |         # write sentence level scores
167 |         if sentence_level_scores:
168 |             sentence_level_scores.write("%i::c%i+w%i-F%i\t%.4f\n"  % (nsent, ncorder, nworder, beta, 100*maxF))
169 | 
170 | 
171 |         # collect document level ngram counts
172 |         for order in range(nworder):
173 |             totalMatchingCount[order] += bestMatchingCount[order]
174 |             totalRefCount[order] += bestRefCount[order]
175 |             totalHypCount[order] += bestHypCount[order]
176 |         for order in range(ncorder):
177 |             totalChrMatchingCount[order] += bestChrMatchingCount[order]
178 |             totalChrRefCount[order] += bestChrRefCount[order]
179 |             totalChrHypCount[order] += bestChrHypCount[order]
180 | 
181 |         averageTotalF += maxF
182 | 
183 |     # all sentences are done
184 |      
185 |     # total precision, recall and F (arithmetic mean of all ngrams)
186 |     totalNgramF, totalNgramRec, totalNgramPrec = ngram_precrecf(totalMatchingCount, totalRefCount, totalHypCount, beta)
187 |     totalChrNgramF, totalChrNgramRec, totalChrNgramPrec = ngram_precrecf(totalChrMatchingCount, totalChrRefCount, totalChrHypCount, beta)
188 | 
189 |     totalF    = (sum(totalChrNgramF.values())    + sum(totalNgramF.values()))    / norder
190 |     averageTotalF = averageTotalF / nsent if nsent > 0 else 0.0  # guard against empty input
191 |     totalRec  = (sum(totalChrNgramRec.values())  + sum(totalNgramRec.values()))  / norder
192 |     totalPrec = (sum(totalChrNgramPrec.values()) + sum(totalNgramPrec.values())) / norder
193 | 
194 |     return totalF, averageTotalF, totalPrec, totalRec
195 | 
196 | 
197 | def main():
198 |     sys.stdout.write("start_time:\t%i\n" % (time.time()))
199 | 
200 | 
201 |     argParser = argparse.ArgumentParser()
202 |     argParser.add_argument("-R", "--reference", help="reference translation", required=True)
203 |     argParser.add_argument("-H", "--hypothesis", help="hypothesis translation", required=True)
204 |     argParser.add_argument("-nc", "--ncorder", help="character n-gram order (default=6)", type=int, default=6)
205 |     argParser.add_argument("-nw", "--nworder", help="word n-gram order (default=2)", type=int, default=2)
206 |     argParser.add_argument("-b", "--beta", help="beta parameter (default=2)", type=float, default=2.0)
207 |     argParser.add_argument("-s", "--sent", help="show sentence level scores", action="store_true")
208 | 
209 |     args = argParser.parse_args()
210 | 
211 |     rtxt = open(args.reference, 'r')
212 |     htxt = open(args.hypothesis, 'r')
213 | 
214 |     sentence_level_scores = None
215 |     if args.sent:
216 |         sentence_level_scores = sys.stdout # Or stderr?
217 | 
218 |     totalF, averageTotalF, totalPrec, totalRec = computeChrF(rtxt, htxt, args.nworder, args.ncorder, args.beta, sentence_level_scores)
219 | 
220 |     sys.stdout.write("c%i+w%i-F%i\t%.4f\n"  % (args.ncorder, args.nworder, args.beta, 100*totalF))
221 |     sys.stdout.write("c%i+w%i-avgF%i\t%.4f\n"  % (args.ncorder, args.nworder, args.beta, 100*averageTotalF))
222 |     #sys.stdout.write("c%i+w%i-Prec\t%.4f\n" % (args.ncorder, args.nworder, 100*totalPrec))
223 |     #sys.stdout.write("c%i+w%i-Rec\t%.4f\n"  % (args.ncorder, args.nworder, 100*totalRec))
224 | 
225 |     sys.stdout.write("end_time:\t%i\n" % (time.time()))
226 | 
227 |     htxt.close()
228 |     rtxt.close()
229 | 
230 | 
231 | if __name__ == "__main__":
232 |     main()
233 | 


--------------------------------------------------------------------------------