Data-driven knowledge discovery forms an integral, bottom-up part of the Neoclassica research framework. It uses algorithms from statistical analysis and machine learning to identify cultural patterns such as constructional or ornamental features. In particular, it has the potential to uncover hitherto unknown patterns in the source data and to help challenge preconceived notions of the dissemination of forms and the evolution of stylistic patterns.
Currently we are experimenting with several algorithms that allow for the classification of objects and of their features in images. Among them are deep learning approaches using both Convolutional Neural Networks (CNNs) and Region-based Convolutional Neural Networks (R-CNNs). Our current hypothesis is that, while CNNs may be perfectly suitable for classifying individual objects, R-CNNs will provide a better foundation for classifying multiple objects within one image as well as the features of an object. In the future we aim to combine this analysis with semantic technologies to provide a truly multimodal analysis.
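The difference between the two kinds of output can be illustrated with a minimal sketch. It does not run any network and is not the project's implementation: the labels, scores, bounding boxes, the `Detection` type and the 0.5 threshold are all hypothetical values chosen for illustration. It only shows that whole-image classification yields one label per image, whereas region-based detection yields several labelled regions, which can cover both objects and features of an object.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """One labelled region, as a region-based detector might report it (hypothetical type)."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def classify_image(class_scores: Dict[str, float]) -> str:
    """Whole-image classification: return the single highest-scoring label."""
    return max(class_scores, key=class_scores.get)


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Region-based output: keep every region whose score reaches the threshold."""
    return [d for d in detections if d.score >= threshold]


# Hypothetical scores for one image, as a whole-image classifier might produce them.
image_scores = {"chair": 0.81, "table": 0.12, "commode": 0.07}

# Hypothetical regions for the same image, including an ornamental feature.
regions = [
    Detection("chair", 0.93, (12, 40, 210, 380)),
    Detection("table", 0.88, (230, 120, 600, 390)),
    Detection("acanthus leaf", 0.64, (60, 300, 95, 340)),
    Detection("commode", 0.21, (400, 10, 520, 90)),
]

print(classify_image(image_scores))
for d in keep_confident(regions):
    print(d.label, d.score, d.box)
```

In this sketch the whole-image result names only the dominant object, while the region list retains the second object and the ornamental detail together with their positions.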
The major impact of automating the classification of forms and artefacts for researching and curating objects will be:
- to shift the focus from conspicuous pieces of art to a broader perspective on art as material culture;
- to provide a method for exploring the vast number of underdocumented objects by relating them to each other on the basis of formal features;
- to provide a method for analysing existing corpora (catalogues raisonnés, museum collections) in order to better understand their shape as a formation of cultural memory (particularly if combined with multimodal analysis of text and visual material).
Building a knowledge discovery module for any science means constantly having to account for both intellectual property rights and openness. In order to provide a rich yet open corpus, we decided on a multilayered approach.