Sabiia Seb
Embrapa
Records retrieved: 25

A Factored Language Model for Prosody Dependent Speech Recognition InTech
Ken Chen; Mark A. Hasegawa-Johnson; Jennifer S. Cole.
In this chapter, we proposed a novel approach that improves the robustness of prosody dependent language modeling by leveraging the dependence between prosody and syntax. In our experiments on the Radio News Corpus, a factorial prosody dependent language model estimated using our proposed approach achieved as much as a 31% reduction in joint perplexity over a prosody dependent language model estimated using the standard Maximum Likelihood approach. In recognition experiments, our approach results in a 1% improvement in word recognition accuracy, a 0.7% improvement in accent recognition accuracy and a 1.5% improvement in intonational phrase boundary (IPB) recognition accuracy over the baseline prosody dependent recognizer. The study in the chapter shows that
Type: 18 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/a_factored_language_model_for_prosody_dependent_speech_recognition

A General Approximation-Optimization Approach to Large Margin Estimation of HMMs InTech
Hui Jiang; Xinwei Li.
In this paper, we have proposed a general Approximation-optiMization (AM) approach for large margin estimation (LME) of Gaussian mixture HMMs in speech recognition. Each iteration of the AM method consists of an A-step and an M-step. In the A-step, the original LME problem is approximated by a simple convex optimization problem in close proximity to the initial model parameters. In the M-step, the approximate convex optimization problem is solved by using efficient convex optimization algorithms. The AM method is a general approach which can be easily applied to discriminative training of statistical models with hidden variables. In this paper, we introduce two examples of applying the AM approach to LME of Gaussian mixture HMMs. The first method uses V-approx and is...
Type: 7 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/a_general_approximation-optimization_approach_to_large_margin_estimation_of_hmms

An Improved GA Based Modified Dynamic Neural Network for Cantonese-Digit Speech Recognition InTech
S.H. Ling; F.H.F. Leung; K.F. Leung; H.K. Lam; H.H.C. Iu.
A Cantonese-digit speech recognizer using a GA-based modified dynamic recurrent neural network has been developed. The structure of the modified neural network consists of two parts: a rule-base 3-layer feed-forward neural network and a classifier 3-layer recurrent neural network. The network parameters are trained by an improved GA. With this specific network structure, the dynamic feature of the speech signals can be generalized and the parameter values of the network can adapt to the values of the input data set. Cantonese digits 0 to 10, 12 and 20 have been used to demonstrate the merits of the proposed network. By using the proposed dynamic network, the dynamic and static information of the speech can be modeled effectively. Therefore,...
Type: 21 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/an_improved_ga_based_modified_dynamic_neural_network_for_cantonese-digit_speech_recognition

Analysis and Implementation of an Automated Delimiter of "Quranic" Verses in Audio Files using Speech Recognition Techniques InTech
Tabbal Hassan; Al-Falou Wassim; Monla Bassem.
The system that we developed showed promising results, although it was only tested against small Quran chapters. We think that the incorporation of morphological knowledge of the Arabic language with a more sophisticated statistical model deduced from the full scope of
Type: 20 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/analysis_and_implementation_of_an_automated_delimiter_of__quranic__verses_in_audio_files_using_speec

Audio Visual Speech Recognition and Segmentation Based on DBN Models InTech
Dongmei Jiang; Guoyun Lv; Ilse Ravyse; Xiaoyue Jiang; Yanning Zhang; Hichem Sahli; Rongchun Zhao.
In this chapter, we first implement an audio or visual single stream DBN model proposed in [Bilmes 2005], and demonstrate that it can break through the limitation of the state-of-the-art `whole-word-state DBN' models and output phone (viseme) segmentation results. Then we expand this model to an audio-visual multi-stream asynchronous DBN (MSADBN) model. In this MSADBN model, the asynchrony between audio and visual speech is allowed to exceed the timing boundaries of phones/visemes, in contrast to the multi-stream hidden Markov models (MSHMM) or product HMM (PHMM), which constrain the audio stream and visual stream to be synchronized at the phone/viseme boundaries. In order to evaluate the performance of the proposed DBN models on word recognition and...
Type: 9 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/audio_visual_speech_recognition_and_segmentation_based_on_dbn_models

Autocorrelation-based Methods for Noise-Robust Speech Recognition InTech
Gholamreza Farahani; Mohammad Ahadi; Mohammad Mehdi Homayounpour.
In this chapter, the importance of the autocorrelation domain in robust feature extraction for speech recognition was discussed. To demonstrate the effectiveness of this domain, some recently proposed methods for feature extraction that is robust against additive noise were discussed. These methods resulted in cepstral feature sets derived from the autocorrelation spectral domain. The DAS algorithm used the differentiated filtered autocorrelation spectrum of the noisy signal to extract cepstral parameters. We noted that, similar to RAS and DPS, DAS can better
Type: 14 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/autocorrelation-based_methods_for_noise-robust_speech_recognition

Bimodal Emotion Recognition using Speech and Physiological Changes InTech
Jonghwa Kim.
In this paper, we treated all stages of emotion analysis, from data collection to classification using short-term observations, and evaluated several fusion methods as well as a hybrid decision scheme. We also compared the results from multimodal classification with the unimodal results. As in our earlier work (Kim et al. 2005), where we relied on longer observation phases and a different set of features, the best results were obtained by the feature-level fusion method in combination with a feature selection stage. In this case, not only user-dependent but also user-independent emotion classification could be improved compared to the unimodal methods.
Type: 15 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/bimodal_emotion_recognition_using_speech_and_physiological_changes

Conversation System of an Everyday Robot Robovie-IV InTech
Noriaki Mitsunaga; Zenta Miyashita; Takahiro Miyashita; Hiroshi Ishiguro; Norihiro Hagita.
This research was supported by the Ministry of Internal Affairs and Communications.
Type: 23 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/conversation_system_of_an_everyday_robot_robovie-iv

Discrete-Mixture HMMs-based Approach for Noisy Speech Recognition InTech
Tetsuo Kosaka; Masaharu Katoh; Masaki Kohda.
This chapter introduced a new method of robust speech recognition using discrete-mixture HMMs (DMHMMs) based on maximum a posteriori (MAP) estimation. The aim of this work was to develop robust speech recognition for adverse conditions which contain both stationary and non-stationary noise. In order to achieve the goal, we proposed two methods.
Type: 10 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/discrete-mixture_hmms-based_approach_for_noisy_speech_recognition

Double Layer Architectures for Automatic Speech Recognition Using HMM InTech
Marta Casar; Jose A. R. Fonollosa.
The future of speech-related technologies is connected to the improvement of speech recognition quality. Until recently, speech recognition technologies and applications had assumed that there were certain limitations regarding vocabulary length, speaker independence, and environmental noise or acoustic events. In the future, however, ASR must deal with these restrictions and it must also be able to introduce other speech-related non-acoustic information that is available in speech signals. Furthermore, HMM-based statistical modelling--the standard state-of-the-art ASR--has several time-domain limitations that are known to affect recognition performance. Context is usually represented by means of spectral dynamic features (namely, its first and second...
Type: 8 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/double_layer_architectures_for_automatic_speech_recognition_using_hmm

Early Decision Making in Continuous Speech InTech
Odette Scharenborg; Louis ten Bosch; Lou Boves.
In the laboratory, listeners are able to reliably identify polysyllabic content words before the end of the acoustic realisation (e.g., Marslen-Wilson, 1987). In real life, listeners not only use acoustic-phonetic information, but also contextual constraints to make a decision about the identity of a word. This makes it possible for listeners to guess the identity of content words even before their uniqueness point. In the research presented here, we investigated an alternative ASR system, called SpeM, that is able to recognise words during the speech recognition process, for its ability to recognise words before their acoustic offset (but after their uniqueness point), a capability that we dubbed `early recognition'. The restriction to recognition at...
Type: 19 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/early_decision_making_in_continuous_speech

Emotion Estimation in Speech Using a 3D Emotion Space Concept InTech
Michael Grimm; Kristian Kroschel.
In this chapter we discussed the recognition of emotions in spontaneous speech. We used a general framework motivated by emotion psychology to describe emotions by means of three emotion "primitives" (attributes), namely valence, activation, and dominance. With these emotion primitives, we proposed a real-valued three-dimensional emotion space concept to overcome the limitations in the state-of-the-art emotion categorization. We tested the method on the basis of 893 spontaneous emotional utterances recorded on a German TV talk-show. For the acoustic representation of the emotion conveyed in the speech signal, we extracted 137 features. These reflected the prosody and the spectral characteristics of the speech. We tested two methods to reduce the problem of...
Type: 16 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/emotion_estimation_in_speech_using_a_3d_emotion_space_concept

Evolutionary Speech Recognition InTech
Anne Spalanzani.
Type: 5 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/evolutionary_speech_recognition

Linearly Interpolated Hierarchical N-gram Language Models for Speech Recognition Engines InTech
Imed Zitouni; Qiru Zhou.
We have investigated a new language modeling approach called linearly interpolated hierarchical n-gram language models. We showed in this chapter the effectiveness of this approach in estimating the likelihood of n-gram events: the linearly interpolated hierarchical n-gram language models outperform both linearly interpolated n-gram language models and backoff n-gram language models in terms of perplexity and also in terms of word error rate when integrated into a speech recognizer engine. Compared to traditional backoff and linearly interpolated LMs, the originality of this approach is in the use of a class hierarchy that leads to a better estimation of the likelihood of n-gram events. Experiments on the WSJ database show that the linearly interpolated n-gram...
Type: 17 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/linearly_interpolated_hierarchical_n-gram_language_models_for_speech_recognition_engines

New Advances in Voice Activity Detection using HOS and Optimization Strategies InTech
J.M. Gorriz; J. Ramirez; C.G. Puntonet.
This paper presented three different schemes for improving speech detection robustness and the performance of speech recognition systems working in noisy environments. These methods are based on: i) statistical likelihood ratio tests (LRTs) formulated in terms of the integrated bispectrum of the noisy signal. The integrated bispectrum is defined as a cross spectrum between the signal and its square, and is therefore a function of a single frequency variable. It inherits the ability of higher order statistics to detect signals in noise, with many other additional advantages; ii) a hard decision clustering approach where a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of...
Type: 3 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/new_advances_in_voice_activity_detection_using_hos_and_optimization_strategies

Novel Approaches to Speech Detection in the Processing of Continuous Audio Streams InTech
Janez Zibert; Bostjan Vesnicer; France Mihelic.
This chapter addresses the problem of speech detection in continuous audio streams and explores the impact of speech/non-speech segmentation on speech-processing applications. We proposed a novel approach for deriving speech-detection features based on phoneme transcriptions from generic speech-recognition systems. The proposed phoneme-recognition features were designed to be recognizer and language independent and could be applied in different speech/non-speech segmentation-classification frameworks. In our evaluation experiments two segmentation-classification frameworks were tested, one based on the Viterbi decoding of hidden Markov models, where speech/non-speech segmentation and detection were performed simultaneously, and the other framework, where...
Type: 2 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/novel_approaches_to_speech_detection_in_the_processing_of_continuous_audio_streams

Sound Localization of Elevation using Pinnae for Auditory Robots InTech
Tomoko Shimoda; Toru Nakashima; Makoto Kumon; Ryuichi Kohzawa; Ikuro Mizumoto; Zenta Iwai.
In this chapter, by using a system consisting of two microphones and one pinna, a method for sound localization using spectral cues was considered. In particular, a robust spectral cue detection method was considered and a method for orientating the robot's head toward a sound source was proposed. In addition, this chapter considers the use of sound source separation in order to attenuate the effect of noise. The conclusions of the present study are summarized as follows:
- Real robotic pinnae were designed and a robot using the pinnae was developed.
- In order to realize sound localization with vertical displacement, an algorithm for detecting spectral cues using the developed pinnae was proposed.
- Spectral cue detection was made robust by considering...
Type: 24 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/sound_localization_of_elevation_using_pinnae_for_auditory_robots

Speech Recognition in Unknown Noisy Conditions InTech
Ji Ming; Baochun Hou.
This chapter investigated the problem of speech recognition in noisy conditions, assuming the absence of prior information about the noise. A method, namely universal compensation, was described, which combines multicondition model training and missing-feature theory to model noises with unknown temporal-spectral characteristics. Multicondition training can be conducted using simulated noisy data to provide a coarse compensation for the noise, and missing-feature theory is applied to refine the compensation by ignoring noise
Type: 11 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/speech_recognition_in_unknown_noisy_conditions

Speech Recognition Under Noise Conditions: Compensation Methods InTech
Angel de la Torre; Jose C. Segura; Carmen Benitez; Javier Ramirez; Luz Garcia; Antonio J. Rubio.
In this chapter, we have presented an overview of methods for noise robust speech recognition and a detailed description of the mechanism degrading the performance of speech recognizers working under noise conditions. Performance is degraded because of the mismatch between training and recognition and also because of the information loss associated with the randomness of the noise. In the group of compensation methods, we have described the VTS approach (as a representative model-based noise compensation method) and histogram equalization (a nonlinear non-model-based method). We have described the differences and advantages of each one, finding that more accurate compensation can be achieved with model-based methods, while non-model-based ones can deal with...
Type: 25 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/speech_recognition_under_noise_conditions__compensation_methods

Talking Robot and the Autonomous Acquisition of Vocalization and Singing Skill InTech
Hideyuki Sawada.
In this paper a talking and singing robot was introduced, which is constructed mechanically with human-like vocal cords and a vocal tract. By introducing the adaptive learning and controlling of the mechanical model with the auditory feedback, the robot was able to
Type: 22 Keywords: Robust Speech Recognition and Understanding.
Year: 2007 URL: http://www.intechopen.com/articles/show/title/talking_robot_and_the_autonomous_acquisition_of_vocalization_and_singing_skill
 

Empresa Brasileira de Pesquisa Agropecuária - Embrapa
All rights reserved, in accordance with Law No. 9.610

Embrapa
Parque Estação Biológica - PqEB s/n°
Brasília, DF - Brasil - CEP 70770-901
Phone: (61) 3448-4433 - Fax: (61) 3448-4890 / 3448-4891 - Customer service (SAC): https://www.embrapa.br/fale-conosco