Speech Recognition Using Articulatory and Excitation Source Features
by K. Sreenivasa Rao Manjunath K E
This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on the excitation source component of speech, and on the dynamics of various articulators during speech production, to enhance speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for the SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound-unit-specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Speech Spectrum Analysis
by Sean A. Fulop
The accurate determination of the speech spectrum, particularly for short frames, is commonly pursued in diverse areas including speech processing, recognition, and acoustic phonetics. With this book the author makes the subject of spectrum analysis understandable to a wide audience, including those with a solid background in general signal processing and those without such background. In keeping with these goals, this is not a book that replaces or attempts to cover the material found in a general signal processing textbook. Some essential signal processing concepts are presented in the first chapter, but even there the concepts are presented in a generally understandable fashion as far as is possible. Throughout the book, the focus is on applications to speech analysis; mathematical theory is provided for completeness, but these developments are set off in boxes for the benefit of those readers with sufficient background. Other readers may proceed through the main text, where the key results and applications will be presented in general heuristic terms, and illustrated with software routines and practical "show-and-tell" discussions of the results. At some points, the book refers to and uses the implementations in the Praat speech analysis software package, which has the advantages that it is used by many scientists around the world, and it is free and open source software. At other points, special software routines have been developed and made available to complement the book, and these are provided in the Matlab programming language. If the reader has the basic Matlab package, he/she will be able to immediately implement the programs in that platform; no extra "toolboxes" are required.
Speech Technology: Theory and Applications (Wiley Series In Agent Technology Ser. #11)
by Kristiina Jokinen Fang Chen
This book gives an overview of the research and application of speech technologies in different areas. One of the special characteristics of the book is that the authors take a broad view of the multiple research areas and a multidisciplinary approach to the topics. One of the goals of this book is to emphasize the application. User experience, human factors and usability issues are the focus of this book.
Speech and Audio Processing for Coding, Enhancement and Recognition
by Tokunbo Ogunfunmi Roberto Togneri Madihally Sim Narasimha
This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.
Speech and Audio Processing: A MATLAB®-based Approach
by Ian Vince Mcloughlin
With this comprehensive and accessible introduction to the field, you will gain all the skills and knowledge needed to work with current and future audio, speech, and hearing processing technologies. Topics covered include mobile telephony, human-computer interfacing through speech, medical applications of speech and hearing technology, electronic music, audio compression and reproduction, big data audio systems and the analysis of sounds in the environment. All of this is supported by numerous practical illustrations, exercises, and hands-on MATLAB examples on topics as diverse as psychoacoustics (including some auditory illusions), voice changers, speech compression, signal analysis and visualisation, stereo processing, low-frequency ultrasonic scanning, and machine learning techniques for big data. With its pragmatic and application-driven focus, and concise explanations, this is an essential resource for anyone who wants to rapidly gain a practical understanding of speech and audio processing and technology.
Speech and Computer
by Rodmonga Potapova Alexey Karpov Iosif Mporas
This book constitutes the refereed proceedings of the 16th International Conference on Speech and Computer, SPECOM 2014, held in Novi Sad, Serbia. The 56 revised full papers presented together with 3 invited talks were carefully reviewed and selected from 100 initial submissions. SPECOM is a conference with a long tradition that attracts researchers in the area of computer speech processing (recognition, synthesis, understanding, etc.) and related domains (including, for instance, signal processing, language and text processing, multi-modal speech processing or human-computer interaction).
Speech and Computer
by Andrey Ronzhin Rodmonga Potapova Nikos Fakotakis
This book constitutes the refereed proceedings of the 17th International Conference on Speech and Computer, SPECOM 2015, held in Athens, Greece, in September 2015. The 59 revised full papers presented together with 2 invited talks were carefully reviewed and selected from 104 initial submissions. The papers cover a wide range of topics in the area of computer speech processing such as recognition, synthesis, and understanding and related domains including signal processing, language and text processing, multi-modal speech processing or human-computer interaction.
Speech and Computer
by Andrey Ronzhin Rodmonga Potapova Géza Németh
This book constitutes the refereed proceedings of the 16th International Conference on Speech and Computer, SPECOM 2014, held in Novi Sad, Serbia. The 56 revised full papers presented together with 3 invited talks were carefully reviewed and selected from 100 initial submissions. SPECOM is a conference with a long tradition that attracts researchers in the area of computer speech processing (recognition, synthesis, understanding, etc.) and related domains (including, for instance, signal processing, language and text processing, multi-modal speech processing or human-computer interaction).
Speech and Computer: 20th International Conference, SPECOM 2018, Leipzig, Germany, September 18–22, 2018, Proceedings (Lecture Notes in Computer Science #11096)
by Rodmonga Potapova Alexey Karpov Oliver Jokisch
This book constitutes the proceedings of the 20th International Conference on Speech and Computer, SPECOM 2018, held in Leipzig, Germany, in September 2018. The 79 papers presented in this volume were carefully reviewed and selected from 132 submissions. The papers present current research in the area of computer speech processing, including recognition, synthesis, understanding and related domains like signal processing, language and text processing, computational paralinguistics, multi-modal speech processing or human-computer interaction.
Speech and Computer: 21st International Conference, SPECOM 2019, Istanbul, Turkey, August 20–25, 2019, Proceedings (Lecture Notes in Computer Science #11658)
by Albert Ali Salah Rodmonga Potapova Alexey Karpov
This book constitutes the proceedings of the 21st International Conference on Speech and Computer, SPECOM 2019, held in Istanbul, Turkey, in August 2019. The 57 papers presented were carefully reviewed and selected from 86 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.
Speech and Computer: 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7–9, 2020, Proceedings (Lecture Notes in Computer Science #12335)
by Rodmonga Potapova Alexey Karpov
This book constitutes the proceedings of the 22nd International Conference on Speech and Computer, SPECOM 2020, held in St. Petersburg, Russia, in October 2020. The 65 papers presented were carefully reviewed and selected from 160 submissions. The papers present current research in the area of computer speech processing including speech science, speech technology, natural language processing, human-computer interaction, language identification, multimedia processing, human-machine interaction, deep learning for audio processing, computational paralinguistics, affective computing, speech and language resources, speech translation systems, text mining and sentiment analysis, voice assistants, etc. Due to the Corona pandemic, SPECOM 2020 was held as a virtual event.
Speech and Computer: 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021, Proceedings (Lecture Notes in Computer Science #12997)
by Rodmonga Potapova Alexey Karpov
This book constitutes the proceedings of the 23rd International Conference on Speech and Computer, SPECOM 2021, held in St. Petersburg, Russia, in September 2021.* The 74 papers presented were carefully reviewed and selected from 163 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.
*Due to the COVID-19 pandemic, SPECOM 2021 was held as a hybrid event.
Speech and Computer: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings (Lecture Notes in Computer Science #13721)
by S. R. Mahadeva Prasanna Alexey Karpov K. Samudravijaya Shyam S. Agrawal
This book constitutes the proceedings of the 24th International Conference on Speech and Computer, SPECOM 2022, held as a hybrid event in Gurugram, India, in November 2022. The 51 full and 9 short papers presented in this volume were carefully reviewed and selected from 99 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.
Speech and Computer: 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part I (Lecture Notes in Computer Science #14338)
by S. R. Mahadeva Prasanna Alexey Karpov Rajesh M. Hegde K. Samudravijaya Shyam S. Agrawal K. T. Deepak
The two-volume proceedings set LNAI 14338 and 14339 constitutes the refereed proceedings of the 25th International Conference on Speech and Computer, SPECOM 2023, held in Dharwad, India, during November 29–December 2, 2023. The 94 papers included in these proceedings were carefully reviewed and selected from 174 submissions. They focus on all aspects of speech science and technology: automatic speech recognition; computational paralinguistics; digital signal processing; speech prosody; natural language processing; child speech processing; speech processing for medicine; industrial speech and language technology; speech technology for under-resourced languages; speech analysis and synthesis; speaker and language identification, verification and diarization.
Speech and Computer: 25th International Conference, SPECOM 2023, Dharwad, India, November 29 – December 2, 2023, Proceedings, Part II (Lecture Notes in Computer Science #14339)
by S. R. Mahadeva Prasanna Alexey Karpov Rajesh M. Hegde K. Samudravijaya Shyam S. Agrawal K. T. Deepak
The two-volume proceedings set LNAI 14338 and 14339 constitutes the refereed proceedings of the 25th International Conference on Speech and Computer, SPECOM 2023, held in Dharwad, India, during November 29–December 2, 2023. The 94 papers included in these proceedings were carefully reviewed and selected from 174 submissions. They focus on all aspects of speech science and technology: automatic speech recognition; computational paralinguistics; digital signal processing; speech prosody; natural language processing; child speech processing; speech processing for medicine; industrial speech and language technology; speech technology for under-resourced languages; speech analysis and synthesis; speaker and language identification, verification and diarization.
Speech and Computer: 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, Proceedings, Part I (Lecture Notes in Computer Science #15299)
by Alexey Karpov Vlado Delić
The two-volume set LNAI 15299 and 15300 constitutes the refereed proceedings of the 26th International Conference on Speech and Computer, SPECOM 2024, held in Belgrade, Serbia, during November 25–28, 2024. The 53 full papers included in these proceedings were carefully reviewed and selected from 90 submissions. The book also contains two invited talks in full paper length. The papers are organized in the following topical sections: Volume I: Invited papers; automatic speech recognition; speech and language resources; speech synthesis and perception; and speech processing for medicine. Volume II: Computational paralinguistics; affective computing; speaker recognition; digital speech processing; natural language processing.
Speech and Computer: 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, Proceedings, Part II (Lecture Notes in Computer Science #15300)
by Alexey Karpov Vlado Delić
The two-volume set LNAI 15299 and 15300 constitutes the refereed proceedings of the 26th International Conference on Speech and Computer, SPECOM 2024, held in Belgrade, Serbia, during November 25–28, 2024. The 53 full papers included in these proceedings were carefully reviewed and selected from 90 submissions. The book also contains two invited talks in full paper length. The papers are organized in the following topical sections: Volume I: Invited papers; automatic speech recognition; speech and language resources; speech synthesis and perception; and speech processing for medicine. Volume II: Computational paralinguistics; affective computing; speaker recognition; digital speech processing; natural language processing.
Speech and Language Processing for Human-Machine Communications
by S. S. Agrawal Amita Devi Ritika Wason Poonam Bansal
This volume comprises the select proceedings of the annual convention of the Computer Society of India. Divided into 10 topical volumes, the proceedings present papers on state-of-the-art research, surveys, and succinct reviews. The volumes cover diverse topics ranging from communications networks to big data analytics, and from system architecture to cyber security. This volume focuses on Speech and Language Processing for Human-Machine Communications. The contents of this book will be useful to researchers and students alike.
Speech and Language Technologies for Low-Resource Languages: First International Conference, SPELLL 2022, Kalavakkam, India, November 23–25, 2022, Proceedings (Communications in Computer and Information Science #1802)
by Thomas Mandl Anand Kumar M Bharathi Raja Chakravarthi Bharathi B Colm O’Riordan Hema Murthy Thenmozhi Durairaj
This book constitutes refereed proceedings from the First International Conference on Speech and Language Technologies for Low-resource Languages, SPELLL 2022, held in Kalavakkam, India, in November 2022. The 25 presented papers were thoroughly reviewed and selected from 70 submissions. The papers are organised in the following topical sections: language resources; language technologies; speech technologies; multimodal data analysis; fake news detection in low-resource languages (regional-fake); low resource cross-domain, cross-lingual and cross-modal offensive content analysis (LC4).
Speech and Language Technologies for Low-Resource Languages: Second International Conference, SPELLL 2023, Perundurai, Erode, India, December 6–8, 2023, Revised Selected Papers (Communications in Computer and Information Science #2046)
by Preslav Nakov Bharathi Raja Chakravarthi Bharathi B Miguel Ángel García Cumbreras Salud María Jiménez Zafra Malliga Subramanian Kogilavani Shanmugavadivel
This book constitutes the refereed conference proceedings of the Second International Conference on Speech and Language Technologies for Low-Resource Languages, SPELLL 2023, held in Perundurai, Erode, India, during December 6–8, 2023. The 27 full papers and 6 short papers presented in this book were carefully reviewed and selected from 94 submissions. The papers are divided into the following topical sections: language resources; language technologies; speech technologies; and workshops - regional fake, MMLOW, LC4.
Speech-to-Speech Translation (SpringerBriefs in Computer Science)
by Yutaka Kidawara Eiichiro Sumita Hisashi Kawai
This book provides readers with retrospective and prospective views of speech-to-speech translation, with detailed explanations of its component technologies: speech recognition, language translation and speech synthesis. A speech-to-speech translation (S2S) system makes it possible to break language barriers, i.e., to let any pair of people on the globe communicate with each other, which is one of the great dreams of humankind. People, society, and the economy connected by S2S will demonstrate explosive growth without exception. In 1986, Japan initiated basic research on S2S; the idea then spread worldwide and was explored in depth by researchers over three decades. Now, S2S applications are seen on smartphones and tablets around the world. Computational resources such as processors, memory, and wireless communication accelerate these computation-intensive systems, and the accumulation of digital speech and language data encourages recent approaches based on machine learning. Through field experiments following long research in laboratories, S2S systems have become well developed and are now ready to be utilized in daily life. A unique chapter of this book is the end-to-end evaluation, which compares the system’s performance with human competence. The effectiveness of the system can be understood from the score of this evaluation. The book ends with one of the next focuses of S2S: simultaneous interpretation technology for lectures, broadcast news and so on.
Speed Metrics Guide: Choosing the Right Metrics to Use When Evaluating Websites
by Matthew Edgar
Faster websites offer a better user experience and typically have higher conversion rates. It can be challenging to know where to invest to meaningfully improve a website's speed. Investing correctly to improve speed starts with understanding how to correctly measure speed and knowing how to use those measurements to identify the biggest opportunities. Speed Metrics Guide helps marketers, SEOs, business leaders, designers, and everybody else involved in website performance select the right metrics to use to optimize their website's speed. Each chapter examines a specific metric, discusses what it measures, why the metric matters and what tactics will help improve that metric.
What You'll Learn
- The best metrics and tools to help you measure website speed, including Google's Core Web Vitals
- How and when to best use each metric
- Where each metric fits within the website loading process
- How to use each metric to find different ways of improving website speed
Who This Book Is For
Non-technical audience, including marketers, SEOs, designers, and UX professionals.
Speed Up Your Python with Rust: Optimize Python performance by creating Python pip modules in Rust with PyO3
by Maxwell Flitton
Discover how to inject your code with highly performant Rust features to develop fast and memory-safe applications.
Key Features
- Learn to implement Rust in a Python system without altering the entire system
- Write safe and efficient Rust code as a Python developer by understanding the essential features of Rust
- Build Python extensions in Rust by using Python NumPy modules in your Rust code
Book Description
Python has made software development easier, but it falls short in several areas including memory management that lead to poor performance and security. Rust, on the other hand, provides memory safety without using a garbage collector, which means that with its low memory footprint, you can build high-performant and secure apps relatively easily. However, rewriting everything in Rust can be expensive and risky as there might not be package support in Rust for the problem being solved. This is where Python bindings and pip come in.
This book will help you, as a Python developer, to start using Rust in your Python projects without having to manage a separate Rust server or application. Seeing as you'll already understand concepts like functions and loops, this book covers the quirks of Rust such as memory management to code Rust in a productive and structured manner. You'll explore the PyO3 crate to fuse Rust code with Python, learn how to package your fused Rust code in a pip package, and then deploy a Python Flask application in Docker that uses a private Rust pip module. Finally, you'll get to grips with advanced Rust binding topics such as inspecting Python objects and modules in Rust.
By the end of this Rust book, you'll be able to develop safe and high-performant applications with better concurrency support.
What you will learn
- Explore the quirks of the Rust programming language that a Python developer needs to understand to code in Rust
- Understand the trade-offs for multiprocessing and thread safety to write concurrent code
- Build and manage a software project with cargo and crates
- Fuse Rust code with Python so that Python can import and run Rust code
- Deploy a Python Flask application in Docker that utilizes a private Rust pip module
- Inspect and create your own Python objects in Rust
Who this book is for
This book is for Python developers who want to speed up their Python code with Rust and implement Rust in a Python system without altering the entire system. You'll be able to learn about all topics relating to Rust programming. Basic knowledge of Python is required to get the most out of this book.
Speed, Data, and Ecosystems: Excelling in a Software-Driven World (Chapman & Hall/CRC Innovations in Software Engineering and Software Development Series)
by Jan Bosch
As software R&D investment increases, the benefits from short feedback cycles using technologies such as continuous deployment, experimentation-based development, and multidisciplinary teams require a fundamentally different strategy and process. This book will cover the three overall challenges that companies are grappling with: speed, data and ecosystems. Speed deals with shortening the cycle time in R&D. Data deals with increasing the use of and benefit from the massive amounts of data that companies collect. Ecosystems address the transition of companies from being internally focused to being ecosystem oriented by analyzing what the company is uniquely good at and where it adds value.
Speeding-Up Radio-Frequency Integrated Circuit Sizing with Neural Networks (SpringerBriefs in Applied Sciences and Technology)
by Nuno C. Lourenço Ricardo M. Martins Nuno C. Horta João L. Domingues Pedro J. Vaz António P. Gusmão
In this book, innovative research using artificial neural networks (ANNs) is conducted to automate the sizing task of RF IC design, with ANNs used in two different steps of the automatic design process. The advances in telecommunications, such as the 5th generation broadband or 5G for short, open doors to advances in areas such as health care, education, resource management, transportation, agriculture and many other areas. Consequently, there is high pressure in today’s market for significant communication rates, extensive bandwidths and ultralow-power consumption. This is where radiofrequency (RF) integrated circuits (ICs) come in, playing a crucial role. This demand highlights the remarkable difficulty of RF IC design in deep nanometric integration technologies, due to their high complexity and stringent performance requirements. Given the economic pressure for high quality yet cheap electronics and challenging time-to-market constraints, there is an urgent need for electronic design automation (EDA) tools to increase the RF designers’ productivity and improve the quality of resulting ICs. In recent years, the automatic sizing of RF IC blocks in deep nanometer technologies has moved toward process, voltage and temperature (PVT)-inclusive optimizations to ensure their robustness. Each sizing solution is exhaustively simulated in a set of PVT corners, thus pushing modern workstations’ capabilities to their limits. Standard ANN applications usually exploit the model’s capability of describing a complex, hard-to-describe relation between input and target data. For that purpose, ANNs are a mechanism to bypass the process of describing the complex underlying relations between data by feeding the model a significant number of previously acquired input/output data pairs that it attempts to copy.
First, the approach departs from the most recent attempts to replace the simulator in simulation-based sizing with a machine/deep learning model by proposing two different ANNs: the first classifies the convergence of the circuit for nominal and PVT corners, and the second predicts the oscillating frequencies for each case. The convergence classifier (CCANN) and frequency guess predictor (FGPANN) are seamlessly integrated into the simulation-based sizing loop, accelerating the overall optimization process. Second, a PVT regressor is proposed that takes the circuit’s sizing and the nominal performances as input and estimates the PVT corner performances via multiple parallel artificial neural networks. Two control phases prevent the optimization process from being misled by inaccurate performance estimates. As such, this book details the optimal description of the input/output data relation that should be fulfilled. The developed description is mainly reflected in two of the system’s characteristics: the shape of the input data and its incorporation in the sizing optimization loop. An optimal description of these components should be such that, once fully trained, the model produces output data that fulfills the desired relation for the given training data. Additionally, the model should be capable of efficiently generalizing the acquired knowledge to newer examples, i.e., never-seen input circuit topologies.