├── example.ref.zh ├── example.hyp.zh ├── example.hyp.cs ├── example.ref.en ├── example.ref.cs ├── example.hyp.en ├── example.ref.ru ├── example.hyp.ru ├── README.md └── chrF++.py /example.ref.zh: -------------------------------------------------------------------------------- 1 | 市社团办相关负责人介绍,本次检查范围包括2013至2015年度在市社团办领用社会团体会费统一收据的社会团体。 2 | 里约奥组委在2012年制定的奥运传播战略中就曾说:“奥运会成功与否是由哪些在社交媒体分享故事的人们定义的。” 3 | 而从本届奥运会的报道来看,各媒体在传播方式、产品形态、语言风格、新闻视角等诸多方面都较以往做了很大的改变。 4 | 国际奥委会和世界各国的媒体从业者,看来都已经要迎接这一轮挑战了。 5 | 8月11日,光大证券H股公开发售进入最后一天。有消息指出,招股前两日市场反应平淡。按照计划,其将于8月18日在港交所挂牌上市。 6 | 国内一家券商分析师表示,东方证券在港基础并不算好,旗下资管业务才是东证所长。相比中银国际、中信证券等银行系券商,显得先天不足。 7 | 专家指出,今年以来随着证券市场回归平淡,很多券商业绩都出现了下滑,但公司上市之后情况要更多看公司本身业务的发展情况,不能一概而论。 8 | 此项调控新政出台后,苏州将成为全国第一个重启限购的二线城市。 9 | 出台或者即将出台楼市调控政策的二线城市不止苏州,还包括合肥、南京等城市。 10 | 此前,合肥已经率先发布限贷政策,对于在合肥名下有两套房且有一套住房贷款未结清的购房者,银行将拒绝提供房贷服务。 11 | -------------------------------------------------------------------------------- /example.hyp.zh: -------------------------------------------------------------------------------- 1 | 市政协会的有关负责人说,这次检查的范围包括在2013年至2015年期间在社会群众组织中接受社会团体收费统一收据的社会团体。 2 | “奥运会的成功是由那些在社交媒体上分享故事的人所确定的。”里约奥运会组委会在确定2012年奥运会的传播战略时表示。 3 | 从奥运会的新闻报道来看,所有媒体在传递方式、产品形式、语言风格、新闻视角和许多其他方面都做出了很多变化。 4 | 看来,国际奥委会和世界各地的媒体工作者都必须迎接这一轮挑战。 5 | 8月11日,这是光大证券 H股公开发售的最后一天。消息说,前两天市场反应并不特别。根据计划,将于8月18日在香港交易所上市。 6 | 一位国内券商分析师说,香港的东方证券没有很好的基础。只有其资本管理业务才是它的实力。与中国银行国际和中信证券等相比,它有许多内在的问题。 7 | 专家指出,随着证券市场回归扁平化,许多券商今年的业绩出现了下滑。但上市公司的详细情况取决于公司业务的详细发展,不能一概而论。 8 | 在新协议出台后,苏州将成为第一个重启购买限制的二级城市。 9 | 除苏州外,包括合肥、南京在内的其他二线城市也将出台房地产市场调控政策。 10 | 在这之前,合肥已率先发布了对贷款政策的限制。对于在合肥有两个套房并有一个房屋贷款未支付的人,他们将被拒绝从银行获得抵押贷款服务。 11 | -------------------------------------------------------------------------------- /example.hyp.cs: -------------------------------------------------------------------------------- 1 | Nedávné nabídky evakuace tvoří režim a Rusko znělo jako slabě zahalené výhrůžky, uvedli chirurgové pastevci a další lékaři. 
2 | Lékaři uvedli, že minulý měsíc došlo k 42 útokům na zdravotnická zařízení v Sýrii, 15 z nich byly nemocnice, ve kterých pracují. 3 | Co nás nejvíc trápí, protože lékaři si vybírají, kdo bude žít a kdo zemře. 4 | Malé děti jsou někdy přivedeny do našich nouzových pokojů tak těžce raněných, že musíme upřednostnit ty, kteří mají lepší šance, 5 | Nebo prostě nemají vybavení, které by jim pomohlo, uvedli lékaři. 6 | Před dvěma týdny čtyři novorozenci, kteří lapali po dechu, se udusili k smrti poté, co výbuch snížil přísun kyslíku do jejich inkubátorů. 7 | Lapali po dechu, jejich životy skončily dřív, než skutečně začaly. 8 | V posledních týdnech síly věrné Bašáru Asadovi podporované ruskými silami ovládly povstalecké východní Aleppo s přes 200 000 lidí stále uvězněných uvnitř bez dodávek potravin. 9 | Povstalcům, kteří jsou pod neustálým útokem ruského a syrského letectva, se v sobotu podařilo prolomit obklíčení a zahájit protiofenzivu. 10 | Obě strany přidaly posily jak ve městě, tak v okolních oblastech. 11 | -------------------------------------------------------------------------------- /example.ref.en: -------------------------------------------------------------------------------- 1 | Recent offers of evacuation form the regime and Russia had sounded like thinly-veiled threats, said the surgeons paediatricians and other doctors. 2 | Doctors said that last month, there were 42 attacks on medical facilities in Syria, 15 of which were hospitals in which they work. 3 | What pains us most, as doctors, is choosing who will live and who will die. 4 | Young children are sometimes brought into our emergency rooms so badly injured that we have to prioritise those with better chances, 5 | or simply do not have the equipment to help them, said the doctors. 6 | Two weeks ago, four newborn babies, gasping for air suffocated to death after a blast cut the oxygen supply to their incubators. 7 | Gasping for air, their lives ended before they had really begun. 
8 | In recent weeks, forces loyal to Bashar al-Assad supported by Russian forces have seized control of the rebel-held east Aleppo with over 200,000 people still trapped inside with no food supplies. 9 | The rebels, who are under constant attack by Russian and Syrian air force, managed to break through the siege on Saturday and launch a counteroffensive. 10 | Both sides have added reinforcements both in the city and the surrounding areas. 11 | -------------------------------------------------------------------------------- /example.ref.cs: -------------------------------------------------------------------------------- 1 | Nedávné nabídky evakuace obyvatel od syrského režimu a Ruska zní jako jen lehce zastřené výhrůžky, podotkli pediatři, chirurgové a další lékaři. 2 | Lékaři uvedli, že za uplynulý měsíc bylo zaznamenáno 42 útoků na lékařská zařízení v Sýrii, z toho 15 na nemocnice, ve kterých pracují. 3 | Nejvíce smutní jsme z toho, že musíme rozhodovat o tom, kdo bude žít a kdo zemře. 4 | Na pohotovost k nám přicházejí malé děti s tak vážnými zraněními, že musíme mezi nimi vybírat ty, u nichž je největší pravděpodobnost, že přežijí. 5 | A někdy nemáme ani potřebný materiál, abychom jim pomohli, popsali lékaři. 6 | Před dvěma týdny se udusili čtyři novorozenci, když výbuch přerušil dodávky kyslíku do jejich inkubátorů. 7 | Lapali po dechu a pak jejich život skončil - dřív, než skutečně mohl začít, připomněli. 8 | Armáda věrná prezidentovi Bašáru Asadovi, kterou podporují ruské síly, v minulých týdnech obklíčila povstalci ovládanou východní část Halabu, uvnitř se ocitlo bez dodávek potravin přes 200.000 lidí. 9 | Rebelům, kteří jsou pod neustálými údery ruského a syrského letectva, se v sobotu po třech týdnech podařilo obklíčení prolomit a zahájit protiofenzivu. 10 | Oběma stranám poté přibyly ve městě a jeho okolí posily. 
11 | -------------------------------------------------------------------------------- /example.hyp.en: -------------------------------------------------------------------------------- 1 | Recent offers of evacuating residents from the Syrian regime and Russia sound like only thinly veiled threats, pediatricians, surgeons and other doctors have said. 2 | Doctors said there had been 42 attacks on medical facilities in Syria over the past month, from 15 to the hospitals in which they work. 3 | The most sad is that we have to decide who lives and who dies. 4 | There are young children coming to the ER with such serious injuries that we have to pick among them the ones most likely to survive. 5 | And sometimes we don't even have the necessary material to help them, the doctors described. 6 | Four newborns were suffocated two weeks ago when an explosion interrupted the supply of oxygen to their incubators. 7 | They were gasping for breath, and then their lives were over - before they could actually begin, they recalled. 8 | An army loyal to President Bashar al-Assad, backed by Russian forces, has besieged the rebel-controlled eastern part of Halaba in past weeks, inside with no food supplies of over 200,000 people. 9 | The rebels, who are under constant strikes by the Russian and Syrian air force, managed to break the siege on Saturday after three weeks and launch a counter-offensive. 10 | Both sides then had reinforcements in the city and its surrounding area. 11 | -------------------------------------------------------------------------------- /example.ref.ru: -------------------------------------------------------------------------------- 1 | G-Power известна своим пристрастием к автомобилям BMW, но ничто человеческое ее мастерам не чуждо. 2 | Действительно, почему бы не усовершенствовать G 63 - 571-сильную летающую кувалду с аэродинамикой танка и разгоном до сотни за 5,4 с? 
3 | Накачивая "мускулы" сумасшедшего "гелика", мастера воспользовались преимуществами, которые открывает 5,5-литровая наддувная "восьмерка". 4 | А вот динамика разгона до сотни улучшилась на неразличимые 0,1 с, и теперь летающий "кирпич" способен на рывок за 5,3 с. 5 | Впрочем, ускорения со средних скоростей наверняка даются ему заметно лучше. 6 | Потолок скорости устрашает: если стандартный G 63 быстрее 210 км/ч не ездит, то тюнинговый вседорожник с деактивированным электронным "ошейником" способен выжать 250 км/ч. 7 | Снаружи автомобиль практически неотличим от своего заводского первоисточника, за исключением других колесных дисков - 23-дюймовых Hurricane RR размерностью 305/35 ZR 23. 8 | Люксовое подразделение Mercedes-Maybach анонсировало поразительное концептуальное купе длиной 6 м, которое представят в рамках Конкурса элегантности в Пеббл-Бич. 9 | Компания Mercedes-Benz собирается конкурировать с "зеленым суббрендом" BMW i и будет продвигать свои электрические модели под отдельной торговой маркой. 10 | В ближайшие годы ожидается появление минимум четырех экологичных новинок. 11 | -------------------------------------------------------------------------------- /example.hyp.ru: -------------------------------------------------------------------------------- 1 | "G-Power" известна своей простотой для автомобилей BMW, но в конце концов, к своим экспертам, транспортное средство является транспортным средством. 2 | Действительно, почему бы не улучшить G 63: 571-л.с. летающий санузел с аэродинамикой цистерны, который идет от нуля до сотни в 5,4 секунды? 3 | Чтобы выгнать "мышцы" этого сумасшедшего "Гелика", эксперты воспользовались выгодами, которые предлагает двигатель с сверхзаряженным двигателем V8. 4 | Ну, скорость, при которой она ускоряется от нуля до сотни, улучшенная незаметно 0,1 секунды: Теперь летающий "кирпич" способен ударить по 100 в 5,3 секунды. 5 | Ум, ускорение при средних скоростях, наверняка, заметно легче для него. 
6 | Максимальная скорость ужасает: В то время как стандарт G 63 не может идти быстрее, чем 210 км / ч, SUV, с отключенным электронным "воротником", способен ударить по 250 км / ч. 7 | Снаружи транспортное средство практически не отличается от заводского оригинала, за исключением различных колец колес: 23-дюймовый ураган RR с 305 / 35ZR23. 8 | Роскошный дивизион, Mercedes-Maybach, объявил поразительный, 6-метровый концепт-купе, который будет представлен в Конкурсе d 'Elegance в Pebble Beach. 9 | Mercedes-Benz намерен конкурировать с "зеленым суббрендом", BMW i, и будет продвигать свои электрические модели под отдельным брендом. 10 | В ближайшие годы ожидается дебют по крайней мере четырех новых экологически безопасных моделей. 11 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # chrF 2 | a tool for calculating character n-gram F-score 3 | 4 | By: Maja Popovic, June 2017 5 | 6 | 7 | chrF++ is a tool for automatic evaluation of machine translation output based on character n-gram precision and recall enhanced with word n-grams. 8 | The tool calculates the F-score averaged over all character and word n-grams, where the default character n-gram order is 6 and word n-gram order is 2. The arithmetic mean is used for n-gram averaging. 9 | 10 | Recent experiments have shown that adding word 1-grams and 2-grams to the standard character 6-grams improves the Pearson correlation with direct human assessments. If you want to use only character n-grams, just set the word n-gram order to 0. 11 | 12 | It is written in Python, so you have to install Python 2 or Python 3. 13 | The option -h, --help outputs a description of the available command line options. 14 | 15 | 16 | Required inputs: 17 | ++++++++++++++++ 18 | 19 | - translation reference and hypothesis 20 | 21 | The required format of all inputs is raw text containing one sentence per line. 
Tokenisation is not necessary. 22 | 23 | In the case of multiple references, all available reference sentences for a segment must be placed on the same line, separated by *# 24 | 25 | 26 | Optional inputs: 27 | ~~~~~~~~~~~~~~~~ 28 | 29 | -nc, --ncorder 30 | character n-gram order (default value is 6) 31 | 32 | -nw, --nworder 33 | word n-gram order (default value is 2) 34 | 35 | -b, --beta 36 | beta parameter to balance precision and recall (default value is 2) 37 | 38 | 39 | Default outputs: 40 | ++++++++++++++++ 41 | - start time 42 | - overall document level F-score 43 | - overall macro-averaged document level F-score (arithmetic average of the sentence level scores) 44 | - end time 45 | 46 | Optional outputs: 47 | ~~~~~~~~~~~~~~~~~ 48 | 49 | -s, --sent 50 | sentence level scores 51 | 52 | 53 | Examples for testing: 54 | -------------------------------------- 55 | 56 | You can try the tool on the given examples containing distinct languages: English (en), Czech (cs), Russian (ru) and Chinese (zh). 57 | For each language, example.ref.lang represents a reference and example.hyp.lang represents a hypothesis. 
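For the multiple-reference format described under the required inputs, a reference file could be assembled roughly as in the sketch below. Only the `*#` separator comes from the tool; the helper names `join_references` and `write_reference_file` and the UTF-8 encoding are assumptions made for illustration.

```python
import io

SEPARATOR = "*#"  # chrF++.py splits each reference line on this string


def join_references(references_per_sentence):
    """Return one line per sentence, alternative references joined by *#.

    Hypothetical helper, not part of chrF++.py.
    """
    lines = []
    for refs in references_per_sentence:
        # Strip surrounding whitespace so every sentence stays on one line.
        cleaned = [r.strip() for r in refs]
        lines.append(SEPARATOR.join(cleaned))
    return lines


def write_reference_file(path, references_per_sentence):
    """Write the joined references to `path`, one sentence per line."""
    with io.open(path, "w", encoding="utf-8") as f:
        for line in join_references(references_per_sentence):
            f.write(line + u"\n")
```

Each line of the resulting file then lines up with the corresponding line of the hypothesis file, as the one-sentence-per-line format requires.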
 58 | 59 | You can try various calls and compare the results: 60 | 61 | 1) a simple call: 62 | 63 | English: 64 | 65 | chrF++.py -R example.ref.en -H example.hyp.en 66 | 67 | start_time: 1497437792 68 | c6+w2-F2 54.9482 69 | c6+w2-avgF2 52.1829 70 | end_time: 1497437792 71 | 72 | Russian: 73 | 74 | chrF++.py -R example.ref.ru -H example.hyp.ru 75 | 76 | start_time: 1497437973 77 | c6+w2-F2 42.2905 78 | c6+w2-avgF2 42.6974 79 | end_time: 1497437973 80 | 81 | 82 | 83 | 2) changing default n-gram orders: 84 | 85 | a) chrF++.py -R example.ref.en -H example.hyp.en -nc 8 -nw 1 86 | 87 | start_time: 1497438072 88 | c8+w1-F2 52.7801 89 | c8+w1-avgF2 49.7979 90 | end_time: 1497438072 91 | 92 | 93 | b) chrF++.py -R example.ref.en -H example.hyp.en -nw 0 (uses only character n-grams -- recommended for Chinese and similar languages) 94 | 95 | start_time: 1497438113 96 | c6+w0-F2 58.0911 97 | c6+w0-avgF2 55.1081 98 | end_time: 1497438113 99 | 100 | 101 | Chinese: 102 | 103 | chrF++.py -R example.ref.zh -H example.hyp.zh -nw 0 104 | 105 | start_time: 1497438131 106 | c6+w0-F2 32.6986 107 | c6+w0-avgF2 33.5167 108 | end_time: 1497438131 109 | 110 | 111 | 112 | 3) changing beta parameter: 113 | 114 | a) chrF++.py -R example.ref.en -H example.hyp.en -b 1 (equal contribution of precision and recall) 115 | 116 | start_time: 1497438189 117 | c6+w2-F1 53.9267 118 | c6+w2-avgF1 50.9922 119 | end_time: 1497438189 120 | 121 | 122 | b) chrF++.py -R example.ref.en -H example.hyp.en -b 0.4 (more weight on precision; the output label prints beta as an integer, hence F0) 123 | 124 | start_time: 1497438211 125 | c6+w2-F0 52.7434 126 | c6+w2-avgF0 50.0280 127 | end_time: 1497438211 128 | 129 | 130 | 131 | 4) sentence level scores: 132 | 133 | chrF++.py -R example.ref.en -H example.hyp.en -s 134 | 135 | start_time: 1497438336 136 | 1::c6+w2-F2 64.0368 137 | 2::c6+w2-F2 70.8799 138 | 3::c6+w2-F2 21.5461 139 | 4::c6+w2-F2 31.9252 140 | 5::c6+w2-F2 44.5054 141 | 6::c6+w2-F2 45.0953 142 | 7::c6+w2-F2 45.6882 143 | 8::c6+w2-F2 54.8102 144 | 9::c6+w2-F2 
81.0330 145 | 10::c6+w2-F2 62.3084 146 | c6+w2-F2 54.9482 147 | c6+w2-avgF2 52.1829 148 | end_time: 1497438336 149 | 150 | -------------------------------------------------------------------------------- /chrF++.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | # Copyright 2017 Maja Popovic 5 | 6 | # The program is distributed under the terms 7 | # of the GNU General Public Licence (GPL) 8 | 9 | # This program is distributed in the hope that it will be useful, 10 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 12 | # GNU General Public License for more details. 13 | 14 | # You should have received a copy of the GNU General Public License 15 | # along with this program. If not, see . 16 | 17 | 18 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 19 | # Publications of results obtained through the use of original or 20 | # modified versions of the software have to cite the authors by referring 21 | # to the following publication: 22 | 23 | # Maja Popović (2015). 24 | # "chrF: character n-gram F-score for automatic MT evaluation". 25 | # In Proceedings of the Tenth Workshop on Statistical Machine Translation (WMT15), pages 392–395 26 | # Lisbon, Portugal, September 2015. 
27 | 28 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29 | 30 | import sys 31 | import math 32 | import unicodedata 33 | import argparse 34 | from collections import defaultdict 35 | import time 36 | import string 37 | 38 | def separate_characters(line): 39 | return list(line.strip().replace(" ", "")) 40 | 41 | def separate_punctuation(line): 42 | words = line.strip().split() 43 | tokenized = [] 44 | for w in words: 45 | if len(w) == 1: 46 | tokenized.append(w) 47 | else: 48 | lastChar = w[-1] 49 | firstChar = w[0] 50 | if lastChar in string.punctuation: 51 | tokenized += [w[:-1], lastChar] 52 | elif firstChar in string.punctuation: 53 | tokenized += [firstChar, w[1:]] 54 | else: 55 | tokenized.append(w) 56 | 57 | return tokenized 58 | 59 | def ngram_counts(wordList, order): 60 | counts = defaultdict(lambda: defaultdict(float)) 61 | nWords = len(wordList) 62 | for i in range(nWords): 63 | for j in range(1, order+1): 64 | if i+j <= nWords: 65 | ngram = tuple(wordList[i:i+j]) 66 | counts[j-1][ngram]+=1 67 | 68 | return counts 69 | 70 | def ngram_matches(ref_ngrams, hyp_ngrams): 71 | matchingNgramCount = defaultdict(float) 72 | totalRefNgramCount = defaultdict(float) 73 | totalHypNgramCount = defaultdict(float) 74 | 75 | for order in ref_ngrams: 76 | for ngram in hyp_ngrams[order]: 77 | totalHypNgramCount[order] += hyp_ngrams[order][ngram] 78 | for ngram in ref_ngrams[order]: 79 | totalRefNgramCount[order] += ref_ngrams[order][ngram] 80 | if ngram in hyp_ngrams[order]: 81 | matchingNgramCount[order] += min(ref_ngrams[order][ngram], hyp_ngrams[order][ngram]) 82 | 83 | 84 | return matchingNgramCount, totalRefNgramCount, totalHypNgramCount 85 | 86 | 87 | def ngram_precrecf(matching, reflen, hyplen, beta): 88 | ngramPrec = defaultdict(float) 89 | ngramRec = defaultdict(float) 90 | ngramF = defaultdict(float) 91 | 92 | factor = beta**2 93 | 94 | for order in matching: 95 | if hyplen[order] > 0: 96 | ngramPrec[order] = 
matching[order]/hyplen[order] 97 | else: 98 | ngramPrec[order] = 1e-16 99 | if reflen[order] > 0: 100 | ngramRec[order] = matching[order]/reflen[order] 101 | else: 102 | ngramRec[order] = 1e-16 103 | denom = factor*ngramPrec[order] + ngramRec[order] 104 | if denom > 0: 105 | ngramF[order] = (1+factor)*ngramPrec[order]*ngramRec[order] / denom 106 | else: 107 | ngramF[order] = 1e-16 108 | 109 | return ngramF, ngramRec, ngramPrec 110 | 111 | def computeChrF(fpRef, fpHyp, nworder, ncorder, beta, sentence_level_scores = None): 112 | norder = float(nworder + ncorder) 113 | 114 | # initialisation of document level scores 115 | totalMatchingCount = defaultdict(float) 116 | totalRefCount = defaultdict(float) 117 | totalHypCount = defaultdict(float) 118 | totalChrMatchingCount = defaultdict(float) 119 | totalChrRefCount = defaultdict(float) 120 | totalChrHypCount = defaultdict(float) 121 | averageTotalF = 0.0 122 | 123 | nsent = 0 124 | for hline, rline in zip(fpHyp, fpRef): 125 | nsent += 1 126 | 127 | # preparation for multiple references 128 | maxF = -1.0 # below any possible score, so the counts of the first reference are always recorded, even when its F-score is 0 129 | bestWordMatchingCount = None 130 | bestCharMatchingCount = None 131 | 132 | hypNgramCounts = ngram_counts(separate_punctuation(hline), nworder) 133 | hypChrNgramCounts = ngram_counts(separate_characters(hline), ncorder) 134 | 135 | # going through multiple references 136 | 137 | refs = rline.split("*#") 138 | 139 | for ref in refs: 140 | refNgramCounts = ngram_counts(separate_punctuation(ref), nworder) 141 | refChrNgramCounts = ngram_counts(separate_characters(ref), ncorder) 142 | 143 | # number of overlapping n-grams, total number of ref n-grams, total number of hyp n-grams 144 | matchingNgramCounts, totalRefNgramCount, totalHypNgramCount = ngram_matches(refNgramCounts, hypNgramCounts) 145 | matchingChrNgramCounts, totalChrRefNgramCount, totalChrHypNgramCount = ngram_matches(refChrNgramCounts, hypChrNgramCounts) 146 | 147 | # n-gram f-scores, recalls and precisions 148 | ngramF, ngramRec, ngramPrec = 
ngram_precrecf(matchingNgramCounts, totalRefNgramCount, totalHypNgramCount, beta) 149 | chrNgramF, chrNgramRec, chrNgramPrec = ngram_precrecf(matchingChrNgramCounts, totalChrRefNgramCount, totalChrHypNgramCount, beta) 150 | 151 | sentRec = (sum(chrNgramRec.values()) + sum(ngramRec.values())) / norder 152 | sentPrec = (sum(chrNgramPrec.values()) + sum(ngramPrec.values())) / norder 153 | sentF = (sum(chrNgramF.values()) + sum(ngramF.values())) / norder 154 | 155 | if sentF > maxF: 156 | maxF = sentF 157 | bestMatchingCount = matchingNgramCounts 158 | bestRefCount = totalRefNgramCount 159 | bestHypCount = totalHypNgramCount 160 | bestChrMatchingCount = matchingChrNgramCounts 161 | bestChrRefCount = totalChrRefNgramCount 162 | bestChrHypCount = totalChrHypNgramCount 163 | # all the references are done 164 | 165 | 166 | # write sentence level scores 167 | if sentence_level_scores: 168 | sentence_level_scores.write("%i::c%i+w%i-F%i\t%.4f\n" % (nsent, ncorder, nworder, beta, 100*maxF)) 169 | 170 | 171 | # collect document level ngram counts 172 | for order in range(nworder): 173 | totalMatchingCount[order] += bestMatchingCount[order] 174 | totalRefCount[order] += bestRefCount[order] 175 | totalHypCount[order] += bestHypCount[order] 176 | for order in range(ncorder): 177 | totalChrMatchingCount[order] += bestChrMatchingCount[order] 178 | totalChrRefCount[order] += bestChrRefCount[order] 179 | totalChrHypCount[order] += bestChrHypCount[order] 180 | 181 | averageTotalF += maxF 182 | 183 | # all sentences are done 184 | 185 | # total precision, recall and F (arithmetic mean of all ngrams) 186 | totalNgramF, totalNgramRec, totalNgramPrec = ngram_precrecf(totalMatchingCount, totalRefCount, totalHypCount, beta) 187 | totalChrNgramF, totalChrNgramRec, totalChrNgramPrec = ngram_precrecf(totalChrMatchingCount, totalChrRefCount, totalChrHypCount, beta) 188 | 189 | totalF = (sum(totalChrNgramF.values()) + sum(totalNgramF.values())) / norder 190 | averageTotalF = averageTotalF / nsent 
191 | totalRec = (sum(totalChrNgramRec.values()) + sum(totalNgramRec.values())) / norder 192 | totalPrec = (sum(totalChrNgramPrec.values()) + sum(totalNgramPrec.values())) / norder 193 | 194 | return totalF, averageTotalF, totalPrec, totalRec 195 | 196 | 197 | def main(): 198 | sys.stdout.write("start_time:\t%i\n" % (time.time())) 199 | 200 | 201 | argParser = argparse.ArgumentParser() 202 | argParser.add_argument("-R", "--reference", help="reference translation", required=True) 203 | argParser.add_argument("-H", "--hypothesis", help="hypothesis translation", required=True) 204 | argParser.add_argument("-nc", "--ncorder", help="character n-gram order (default=6)", type=int, default=6) 205 | argParser.add_argument("-nw", "--nworder", help="word n-gram order (default=2)", type=int, default=2) 206 | argParser.add_argument("-b", "--beta", help="beta parameter (default=2)", type=float, default=2.0) 207 | argParser.add_argument("-s", "--sent", help="show sentence level scores", action="store_true") 208 | 209 | args = argParser.parse_args() 210 | 211 | rtxt = open(args.reference, 'r') 212 | htxt = open(args.hypothesis, 'r') 213 | 214 | sentence_level_scores = None 215 | if args.sent: 216 | sentence_level_scores = sys.stdout # Or stderr? 
217 | 218 | totalF, averageTotalF, totalPrec, totalRec = computeChrF(rtxt, htxt, args.nworder, args.ncorder, args.beta, sentence_level_scores) 219 | 220 | sys.stdout.write("c%i+w%i-F%i\t%.4f\n" % (args.ncorder, args.nworder, args.beta, 100*totalF)) 221 | sys.stdout.write("c%i+w%i-avgF%i\t%.4f\n" % (args.ncorder, args.nworder, args.beta, 100*averageTotalF)) 222 | #sys.stdout.write("c%i+w%i-Prec\t%.4f\n" % (args.ncorder, args.nworder, 100*totalPrec)) 223 | #sys.stdout.write("c%i+w%i-Rec\t%.4f\n" % (args.ncorder, args.nworder, 100*totalRec)) 224 | 225 | sys.stdout.write("end_time:\t%i\n" % (time.time())) 226 | 227 | htxt.close() 228 | rtxt.close() 229 | 230 | 231 | if __name__ == "__main__": 232 | main() 233 | --------------------------------------------------------------------------------
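The per-order scoring done by `ngram_matches`, `ngram_precrecf` and `computeChrF` above can be condensed into a small standalone sketch. It is an illustration only, assuming a single reference and character n-grams without word n-grams; the names `char_ngrams`, `f_beta` and `simple_chrf` are made up here and are not part of chrF++.py.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces."""
    chars = text.strip().replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def f_beta(precision, recall, beta=2.0):
    """F = (1 + beta^2) * P * R / (beta^2 * P + R), or 0.0 if P = R = 0."""
    factor = beta ** 2
    denom = factor * precision + recall
    if denom > 0:
        return (1 + factor) * precision * recall / denom
    return 0.0


def simple_chrf(hypothesis, reference, max_order=6, beta=2.0):
    """Arithmetic mean of the per-order character n-gram F-scores."""
    scores = []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        # Clipped overlap: each n-gram counts at most as often as it
        # occurs in both the hypothesis and the reference.
        matches = float(sum((hyp & ref).values()))
        precision = matches / sum(hyp.values()) if hyp else 0.0
        recall = matches / sum(ref.values()) if ref else 0.0
        scores.append(f_beta(precision, recall, beta))
    return sum(scores) / max_order
```

With the default beta of 2, recall weighs more than precision, so `f_beta(0.5, 1.0)` is larger than `f_beta(1.0, 0.5)`; a beta below 1 reverses that preference.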