Futrell and colleagues at MIT's Department of Brain and Cognitive Sciences conducted a corpus-based big-data analysis of 37 natural languages, providing the latest empirical evidence for the dependency distance minimization tendency in human languages and drawing wide attention in the scientific community. The MIT study is probably the dependency distance study covering the most languages to date; it tests the universality claim more comprehensively than earlier work and has some methodological features of its own, but several of its conclusions and views are open to question. As early as eight years before, Liu Haitao's team at Zhejiang University had investigated the dependency distance minimization tendency in depth, advancing our understanding of how human languages form and evolve under the constraints of universal cognitive mechanisms, and deepening our knowledge of those mechanisms themselves. Together, these studies show that language research is an inherently interdisciplinary field, one in which multilingual analysis, big-data techniques, the search for linguistic universals, and cognitive science converge. Research that combines these four elements will exemplify contemporary scientific inquiry and will play an important role in driving the development of modern language science.
For decades, dependency distance/length minimization (DDM) has been pursued as a universal force shaping human languages. In the Early Edition of PNAS, Futrell et al. suggest that dependency length minimization is a universal property of human languages, and that it therefore supports explanations of linguistic variation in terms of general properties of human information processing. Covering as many as 37 natural languages, this may be the largest-scale survey of its kind to date, and it immediately drew great attention worldwide. However, questions remain about their research, since dependency distance can be sensitive to many factors. In the same line of work, eight years earlier, Prof. Liu Haitao of Zhejiang University had compared the dependency distances of 20 natural languages with those of two different random-language baselines, and pointed out that dependency distance minimization is probably universal in human languages. Altogether, these studies of DDM in human languages show the value of investigating linguistic universals cognitively through statistical analysis of big language data. They thus suggest that, to obtain truly scientific discoveries, linguistic study may well need to integrate efforts from multiple disciplines: cross-language analysis, big-data mining, language universals, and cognitive science.
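The quantity at the heart of both studies can be made concrete. The distance of a dependency link is the absolute difference between the linear positions of a word and its head, and a sentence's mean dependency distance (MDD) averages this over all non-root links. The sketch below is an illustration only, not the exact procedure of either study; the example sentence and its head annotation are hypothetical.

```python
# Illustrative sketch: mean dependency distance (MDD) of one sentence.
# A sentence is encoded as a list of 1-based head positions, one per word,
# with 0 marking the root (as in CoNLL-style dependency annotation).

def mean_dependency_distance(heads):
    """Return the average of |position(head) - position(dependent)|
    over all non-root dependency links."""
    distances = [abs(head - (i + 1))          # link distance for word i+1
                 for i, head in enumerate(heads)
                 if head != 0]                # skip the root, which has no head
    return sum(distances) / len(distances)

# Hypothetical parse of "The cat sat on the mat":
# The->cat, cat->sat, sat=root, on->sat, the->mat, mat->on
heads = [2, 3, 0, 3, 6, 4]
print(mean_dependency_distance(heads))  # -> 1.2
```

Under the DDM hypothesis, attested word orders should yield lower MDD values than random reorderings of the same dependency trees, which is the kind of comparison against random baselines described above.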