The F-TEMPO project, funded by the British Academy and JISC through the Digital Research in the Humanities scheme, is being carried out at Goldsmiths, University of London. The Principal Investigator is Prof. Tim Crawford (Dept of Computing).
This extra music will include the complete original published works of influential composers such as Marenzio and Monteverdi, and representative selections from Josquin, Lassus, Palestrina, da Rore and others. With this resource, enriched with metadata, we could for the first time explore, for example, the networks of influence, distribution and fashion, and the effects on these of political, religious and social change over time, as represented in the output of the burgeoning 16th-century music publishing industry.
Vast amounts of music, mostly audio tracks, are now available through services such as Spotify, iTunes or YouTube. Music Information Retrieval (MIR) has mostly focused on audio material, aiming to make discovery and retrieval from the Internet feasible, with the music industry emphasising the requirements of its paying customers.
Music in graphic form is also available online as PDF files rendering page-images of either original musical documents or modern, computer-generated music notation. Such resources are a surrogate for the paper-based books used in traditional musicology, but offer few advantages beyond convenience. They do not support full-text search, unlike the text-based and numerical materials which are increasingly the subject of ‘distant reading’ investigations in the digital humanities.
Our OMR program is Aruspix, which is highly reliable on good images from EMO, even though they have been digitised from microfilm.
Although OMR is far from perfect, users are often content to apply the methods of computer science to large collections that contain noise. This is the principle behind searches in Google Books, which are based on Optical Character Recognition (OCR). Another online resource which has inspired F-TEMPO, this time based on images of Japanese woodblock prints, is Ukiyo-e Search, which permits instantaneous identification of a print even from a poor-quality image taken with a mobile phone.
SIMSSA is using OMR and MIR to work towards a very large virtual and distributed collection of music, accessible in the way we envisage to musicologists and musicians of all other types. As Associate Partners in SIMSSA, we see F-TEMPO as a contribution to this international effort.
From the OMR output for each page, we extract diatonic pitch-interval strings, which are robust to several types of OMR error involving wrong clefs, key signatures and accidentals. From these we derive sets of Minimal Absent Words, a type of feature recently developed for bioinformatics analysis and retrieval. These are used as an index for fast and scalable search and retrieval.
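The indexing idea can be illustrated with a minimal Python sketch. It is not the project's own code: the function names, the letter encoding of intervals, the `max_len` cut-off and the Jaccard comparison are all assumptions made for illustration, and the brute-force enumeration stands in for the far more efficient algorithms used in bioinformatics. A word is a minimal absent word of a string if it does not occur in the string but every shorter word it contains does.

```python
def diatonic_intervals(staff_positions):
    """Differences between successive notes, counted in diatonic steps.

    Input is a list of staff positions (line/space indices), an assumed
    representation of OMR output. A wrong clef shifts every position by
    the same amount, so the intervals are unchanged; accidentals and key
    signatures do not affect staff position at all.
    """
    return [b - a for a, b in zip(staff_positions, staff_positions[1:])]


def encode(intervals):
    """Encode each interval as one character (hypothetical alphabet):
    '-' for a repeated note, 'A', 'B', ... for rising steps,
    'a', 'b', ... for falling steps. Large leaps are clamped to 26."""
    chars = []
    for iv in intervals:
        size = min(abs(iv), 26)
        if iv == 0:
            chars.append("-")
        elif iv > 0:
            chars.append(chr(ord("A") + size - 1))
        else:
            chars.append(chr(ord("a") + size - 1))
    return "".join(chars)


def minimal_absent_words(s, max_len=4):
    """Brute-force minimal absent words of s with length 2..max_len.

    A word a+u+b qualifies when a+u and u+b both occur in s but a+u+b
    does not. Only letters that occur in s are considered.
    """
    factors = {
        s[i:j]
        for i in range(len(s))
        for j in range(i + 1, min(i + max_len, len(s)) + 1)
    }
    alphabet = sorted(set(s))
    maws = set()
    for left in factors:              # left plays the role of a+u
        if len(left) >= max_len:
            continue
        u = left[1:]
        for b in alphabet:
            if left + b not in factors and u + b in factors:
                maws.add(left + b)
    return maws


def jaccard(a, b):
    """Set overlap, used here as an assumed stand-in for index lookup."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    page = [0, 1, 2, 1, 0]            # a short rising and falling figure
    wrong_clef = [p + 2 for p in page]  # same notes read with a wrong clef
    code = encode(diatonic_intervals(page))
    print(code == encode(diatonic_intervals(wrong_clef)))
    print(sorted(minimal_absent_words(code)))
```

In this sketch each page would be reduced to a set of short words, and two pages would be compared by the overlap of their sets. The real system's index structure and similarity measure may differ.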
Philippe Vendrix (President, University of Tours, France)