Version 6.04 (002)

ANTS project

Automatic Newscast Transcription System

Active project

A platform called ANTS has been built for the automatic analysis of newscasts. It is a modular system in which several audiovisual signal analysis tools are coordinated by a process management engine, which runs the processing stages in sequence and aggregates the resulting data.
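The coordination described above can be pictured with a minimal sketch: an engine runs named stages in order and aggregates their outputs. The function `run_pipeline`, the stage names and the placeholder stages are hypothetical illustrations, not the actual ANTS interfaces.

```python
from typing import Any, Callable, Dict, List, Tuple

# A stage receives the media path plus the data aggregated so far
# and returns its own result.
Stage = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def run_pipeline(media_path: str, stages: List[Tuple[str, Stage]]) -> Dict[str, Any]:
    """Run each stage in sequence and store its output under the stage name."""
    aggregated: Dict[str, Any] = {}
    for name, stage in stages:
        aggregated[name] = stage(media_path, aggregated)
    return aggregated

# Hypothetical placeholder stages; real tools would analyse the signal.
def transcribe(path: str, data: Dict[str, Any]) -> Dict[str, Any]:
    return {"words": ["good", "evening"]}

def segment(path: str, data: Dict[str, Any]) -> Dict[str, Any]:
    # A later stage can read what earlier stages produced.
    return {"n_words_seen": len(data["asr"]["words"])}

result = run_pipeline("newscast.mp4", [("asr", transcribe), ("segmentation", segment)])
```

Keeping every stage behind the same call signature is one simple way to let tools be added or replaced without changing the engine.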

The key technology is Automatic Speech Recognition (ASR), which provides a faithful verbatim transcript of what is said in the programme.

The ASR recognizer has been optimized for the news domain by training it on a large number of manually transcribed news broadcasts. The transcription quality obtained is around 90% correct recognition. Since the text is synchronized with the multimedia signal, any given word gives immediate access to the passage in which it is spoken. In addition, the recognizer segments the signal by the voiceprint of the speaker. The resulting transcript lends itself well to free-text search and to further processing with artificial intelligence techniques.
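As a rough illustration of how a time-aligned transcript allows jumping from a word to the passage where it is spoken, the sketch below builds a word-to-timestamp index. The input layout (pairs of word and start time in seconds) and the name `build_word_index` are assumptions made for the example, not the ANTS data model.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_word_index(words: List[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Map each lower-cased word to the start times (seconds) at which it is spoken."""
    index: Dict[str, List[float]] = defaultdict(list)
    for word, start in words:
        index[word.lower()].append(start)
    return dict(index)

# Hypothetical time-aligned transcript: (word, start time in seconds).
transcript = [
    ("Good", 0.0), ("evening", 0.4), ("the", 1.1),
    ("government", 1.3), ("Government", 42.7),
]

index = build_word_index(transcript)
# Seek points a player could jump to for the query "government".
seek_points = index.get("government", [])
```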

Segmentation into news items

Automatic segmentation into news items is performed by a module designed and developed by the Research Centre, based on analysis of the audiovisual content.

The basic concept is simple, but requires a priori knowledge of the programme format. In RAI newscasts, each news item is generally announced by the studio host and then covered in more depth by external reports.

By identifying the sequences in which the host (or, alternatively, the studio) appears, one obtains cut points that coincide with the change from one news item to the next. These sequences are identified by a module that divides the video into shots of homogeneous content and then groups them by similarity.

By analysing the resulting groups, it is then possible to select those whose characteristics are most similar to those of the chosen model.
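The grouping and selection steps can be sketched as follows. This is only an illustration under stated assumptions: each shot is represented by a small hypothetical feature vector, shots are grouped with a simple greedy threshold rule, and the group closest to a hypothetical studio model is chosen. The features, the threshold and the clustering rule are stand-ins, not the method actually used in ANTS.

```python
import math
from typing import List, Sequence

def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def group_shots(features: List[Sequence[float]], threshold: float) -> List[List[int]]:
    """Greedy grouping: a shot joins the first group whose centroid is within threshold."""
    clusters: List[List[int]] = []
    for i, feature in enumerate(features):
        for cluster in clusters:
            if distance(feature, centroid([features[j] for j in cluster])) <= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no close group found: start a new one
    return clusters

def pick_anchor_cluster(features, clusters, model):
    """Select the group whose centroid is most similar to the studio model."""
    return min(clusters, key=lambda c: distance(centroid([features[j] for j in c]), model))

# Hypothetical two-dimensional shot descriptors (for example, colour statistics).
shots = [[0.9, 0.1], [0.2, 0.8], [0.88, 0.12], [0.5, 0.5], [0.91, 0.09]]
clusters = group_shots(shots, threshold=0.1)
anchor_shots = pick_anchor_cluster(shots, clusters, model=[0.9, 0.1])
```

With this toy data, shots 0, 2 and 4 end up in the same group, and that group is the one selected as the studio shots.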

To strengthen this hypothesis, the speaker segmentation produced by the ASR module is also used, on the assumption that the voice of the studio host is the one that occurs most often. In this way the recognition errors of each individual tool are minimized. The resulting identification of news changes has an average accuracy of about 80%.

Once segmentation into news items has been performed, a semantic analysis module is applied. It automatically classifies each text according to the scheme used by Rai's documentalists, based on 28 main categories relating to journalism. The accuracy achieved is comparable to that of a human classifier.
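The cross-check between visual and speaker evidence described above might look like the following sketch. It assumes speaker turns given as (speaker, start, end) in seconds, takes the speaker with the most turns as the studio host, and keeps only the candidate studio shots that overlap one of that speaker's turns. All names and the data layout are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Interval = Tuple[float, float]
Turn = Tuple[str, float, float]

def dominant_speaker(turns: List[Turn]) -> str:
    """Return the speaker label with the largest number of turns."""
    counts = Counter(speaker for speaker, _, _ in turns)
    return counts.most_common(1)[0][0]

def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two time intervals share any duration."""
    return a[0] < b[1] and b[0] < a[1]

def confirm_boundaries(candidate_shots: List[Interval], turns: List[Turn]) -> List[float]:
    """Keep the start times of candidate shots during which the dominant speaker talks."""
    host = dominant_speaker(turns)
    host_turns = [(start, end) for speaker, start, end in turns if speaker == host]
    return [shot[0] for shot in candidate_shots
            if any(overlaps(shot, turn) for turn in host_turns)]

# Hypothetical speaker turns and visually detected studio shots (seconds).
turns = [("spk1", 0.0, 20.0), ("spk2", 20.0, 75.0), ("spk1", 75.0, 95.0),
         ("spk3", 95.0, 140.0), ("spk1", 140.0, 160.0)]
candidates = [(0.0, 18.0), (40.0, 45.0), (76.0, 94.0), (141.0, 158.0)]
boundaries = confirm_boundaries(candidates, turns)
```

In this toy example the candidate shot at 40 s is discarded because another speaker is talking, which shows how one tool can correct a false detection by the other.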

The results of these processes are presented side by side in a synoptic view through a web browser interface, as exemplified in the figure.

Related Projects

Closed project

European project PrestoPRIME

Keeping Audiovisual Content Alive

The European PrestoPRIME project aims to study and develop solutions that contribute to the creation of a framework for long-term preservation of digital audiovisual products.

Active project

Hyper Media News

System for automatic content analysis

Hyper Media News is a system that integrates the information automatically generated by ANTS with information available on the web through daily online news sites.
The Hyper Media News system, together with ANTS, has already been successfully demonstrated on several occasions during international scientific conferences and during the Prix Italia 2009.

Active project

Archives and digital Teche

A huge archive, and the information technology to turn it into a working tool: a handy store, familiar territory, just a click away. It is a complex undertaking that aims to computerize the Teche Rai, raising difficult issues that a multi-year business plan is addressing.

Active project

System for scanning News

A digital, computerized News production system is under construction. The production model for news and its associated headings is based on exploiting content acquired shortly before publication: a given topic stays “hot” for only a few days, until it is overtaken by some new event.

Active project

Computerization of Production

We need to move from a television production model based on vertical supply chains to one based on horizontal integration: from an organizational and technological structure made up of separate units, each specialized in a particular distribution channel, to an integrated model capable of producing and distributing content on heterogeneous channels.