Abstract Understanding hominin behaviour is a central goal of Palaeolithic archaeology. As places where hominins once lived, archaeological sites are vital for identifying traces of hominin activities and reconstructing past hominin lifeways. Sites have therefore been analyzed from various perspectives, among which site structure analysis is considered one of the most informative. Site structure is closely related to the function of the artefacts found in situ, because clusters of artefacts with similar functions indicate potential spatial differentiation of tasks. In Palaeolithic archaeology, use-wear analysis has been widely used to identify the function of artefacts used by hominins, especially lithic tools. However, despite its wide application and considerable potential, use-wear analysis provides functional information only for individual stone artefacts and cannot reveal the relationships among artefacts within an assemblage. Data mining is introduced in this paper to fill this gap. Data mining is the process of extracting hidden, previously unknown information from large amounts of data. Following its successful application to the study of intra-site spatial patterns in Neolithic settlements, this paper uses data mining approaches, specifically association rules and cluster analysis, to handle the large volume of data produced by use-wear analysis of lithic tools and to identify possible correlations among them. The data analyzed come from Loc. 1 of the Wulanmulun site, an important Middle Palaeolithic archaeological site located on the northern bank of the Wulanmulun River, Ordos, Inner Mongolia. 
After years of excavation, three localities have been identified and abundant hominin remains uncovered. At Loc. 1, the 2010 to 2011 excavations yielded about 4,200 lithic tools, 3,400 faunal fossils, two traces of fire use and two animal footprint features. Dating results place the Wulanmulun site at 65 to 50 kya, within the Middle Palaeolithic. A previous use-wear study of 491 lithic tools from the site yielded abundant results, enabling the construction of a use-wear database. In this research, 254 lithic tools with 296 functional units were used as samples. The variables considered include tool measurements and attributes (length, width, length-to-width ratio, thickness, weight, raw material, type, layer and coordinates) as well as use-wear data (motion, worked materials and behaviours). After pre-processing and standardization, all data were loaded into Python for further processing with the Apriori and K-means algorithms. The Apriori algorithm was used to summarize the general characteristics of the use-wear data, while the K-means algorithm was used to identify possible functional areas at the site. The conclusions are as follows: (1) More than 80% of the lithic tools were used for butchering, indicating that there may have been a butchering area at Loc. 1 of the Wulanmulun site. (2) The results of the Apriori algorithm suggest that lithic tools with similar functions are similar in length, width, length-to-width ratio, thickness and weight, which may reflect intentional selection by hominins. (3) According to the results of the K-means algorithm, there is a distinct horizon at about 270 cm below the ground surface at Loc. 1, where butchering activities were concentrated. This research is one of the first applications of data mining in Palaeolithic archaeology and demonstrates the considerable potential of the method. 
As a quantitative approach, data mining can process archaeological data efficiently, objectively and accurately, and can therefore serve as a powerful tool in future archaeological studies. To support its wider application, archaeological data should be recorded thoroughly during excavation, which would improve the accuracy and scientific rigour of archaeological research.
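
The two analyses named in this abstract, association-rule mining with Apriori and clustering with K-means, can be illustrated with a minimal sketch. It is not the code used in this research. It relies only on the Python standard library, the records and depths are invented placeholder values rather than data from Wulanmulun, and the item labels (`motion=...`, `material=...`, `size=...`), the thresholds and the function names `apriori`, `rules` and `kmeans_1d` are hypothetical. The clustering is reduced to a single variable (depth) for brevity, whereas the study standardizes and uses several variables.

```python
from itertools import combinations

# Invented example records: one set of categorical items per function unit.
records = [
    {"motion=cutting", "material=meat", "size=small"},
    {"motion=cutting", "material=meat", "size=small"},
    {"motion=scraping", "material=hide", "size=large"},
    {"motion=cutting", "material=meat", "size=large"},
    {"motion=scraping", "material=hide", "size=large"},
]


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent, k = {}, 1
    while current:
        # Keep only candidates whose relative frequency reaches the threshold.
        level = {}
        for cand in current:
            sup = sum(cand <= t for t in transactions) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets, then prune candidates that contain
        # an infrequent (k-1)-subset (the Apriori property).
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def rules(frequent, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(all) / support(lhs)."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[lhs]
                if conf >= min_confidence:
                    out.append((set(lhs), set(itemset - lhs), sup, conf))
    return out


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on one variable, with a deterministic start."""
    ordered = sorted(values)
    # Initial centres are spread evenly across the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # Recompute each centre as the mean of its group; keep empty ones.
        updated = [sum(g) / len(g) if g else centres[j]
                   for j, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return centres, groups


frequent = apriori(records, min_support=0.4)
strong = rules(frequent, min_confidence=0.9)

# Invented depths in cm below the surface, not measurements from the site.
depths = [118, 122, 125, 127, 130, 133, 205, 210, 218]
centres, groups = kmeans_1d(depths, k=2)

for lhs, rhs, sup, conf in strong:
    print(sorted(lhs), "->", sorted(rhs), round(sup, 2), round(conf, 2))
print(centres, [len(g) for g in groups])
```

In practice, continuous measurements such as length or weight would have to be binned into categories before association-rule mining, and standardized before clustering so that no single variable dominates the distance calculation.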