CALL FOR PAPER OFFERINGS for LREC Shared Task on Reproducibility

Did you recently author a paper in Natural Language Processing or
Computational Linguistics? Are you interested in greater visibility
for your paper, and curious how well other researchers can
reproduce your results?

Please consider offering your paper for the upcoming
Shared Task on Reproducibility!


Scientific knowledge is grounded in falsifiable predictions, and thus
its credibility and raison d’être rely on the possibility of
repeating experiments and obtaining results similar to those
originally reported. In many young scientific areas, including
ours, the reproduction of results needs much more acknowledgment
and promotion.

Fwd: H2020-funded Post-doc in Neural Machine Translation, Tartu, Estonia

Hi all,

We are looking for a post-doc in neural machine translation as part of the Horizon 2020 ICT RIA project “Bergamot: Browser-based Multilingual Translation”. If you have recently defended a PhD thesis (or are close to defending) and have experience with deep learning methods in machine translation and other sequence-to-sequence NLP, then this is your chance to work in close collaboration with several internationally renowned research groups and grow as an independent researcher.

The aim of the Bergamot project is to allow NMT engines to run offline in a web browser, while dynamically adapting to the user’s requirements, content and the surrounding context, as well as yielding controllable output quality. As part of the position you will work on faster and lighter multi-domain and multilingual NMT approaches that can adapt to various kinds of contextual information.

ParaCrawl corpus release v3.0

The third version of the ParaCrawl corpus has been released! The parallel corpus is now available in all 24 official EU languages. Six new languages are added in the v3 release, namely Bulgarian, Danish, Greek, Slovak, Slovenian and Swedish. For the previously released languages, more data has been added to the corpus.
For each language, two versions of the corpus are released, each produced with a different cleaning tool: BiCleaner and Zipporah. The ParaCrawl corpus is crawled from a large number of websites. The selection of websites is based on CommonCrawl, but ParaCrawl is extracted from a brand new crawl that has much higher coverage of the selected websites than CommonCrawl.



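Professor Shi Xiaodong and Two Doctoral Students from the Lab Travel to Anyang to Attend the First International Symposium on Oracle Bone Script Information Processing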

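Dr. Wan Xiaojun of the Institute of Computer Science and Technology, Peking University, and Tencent AI Lab Senior Researcher Li Jing Give Invited Academic Lectures at the University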

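On November 21, 2018, from 19:10 to 21:30 in Room 304 of the Haiyun Teaching Building, at the invitation of Professor Shi Xiaodong, Professor Wan Xiaojun, a researcher at the Institute of Computer Science, Peking University, and Li Jing, a senior researcher at Tencent AI Lab, gave academic lectures at the university, titled "Machine Writing: Technology and Applications" and "Understanding Social Media Text", respectively.
Professor Wan's talk focused on machine writing. He first gave an overview of the field in terms of tasks, concrete applications, resources, and models. Then, starting from his team's research topics, he used three research projects (Data2Text, Pun Generation, and Images2Poem) to describe in detail the process and practical applications of machine writing. Finally, he pointed out the challenges facing machine writing and possible ways to address them. The lecture was rich in content and made complex ideas accessible.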


1st Call for Workshop Proposals: Machine Translation Summit 2019

Dublin, Ireland

Machine Translation Summit workshops are intended to provide the
opportunity for MT-related communities of interest to spend time
together advancing the state of thinking or the state of practice in
their area of endeavour. We are particularly interested in submissions
related to commercialisation of MT and/or its use by translators.
However, any themes connected to MT research, development, deployment,
use, and evaluation are welcome. Topics for past workshops have included
post-editing, translation of patents, monolingual aspects of MT,
collaborative translation, lexical resources, and MT research and the
translation industry, to mention just a few.