As one of the most innate and ubiquitous forms of expression, speech is a fundamental pillar of human interaction. It comes as no surprise, then, that Google Cloud’s Speech API has become a crucial tool for enterprise customers: it launched to general availability (GA) over six years ago and now processes over 1 billion voice minutes each month.

With the Speech API, we have been pleased to serve thousands of customers and provide industry-leading speech recognition quality and cost-effective products across a range of industries. We want to constantly evolve our offerings and bring new benefits to organizations, which is why today we are excited to announce the GA release of our new Speech-to-Text v2 API. Speech-to-Text v2 modernizes our API interface and introduces several new features. It also migrates all of our existing functionality, so you can use the same models and features that you were using in the STT v1 or v1p1beta1 APIs.

This new version of our API also allows us to take advantage of significant cost savings in our serving path, and as such we are reducing our base price, as well as adding pricing incentives for large workloads and for customers willing to accept longer turnaround times. This new infrastructure also allows us to serve a wide variety of new types of models, including Chirp, our latest 2B-parameter large speech model. All of these are Generally Available to Google Cloud Platform customers and users starting today. Let’s take a more thorough look at the enhanced features of the Speech-to-Text v2 API and illustrate how your business can benefit from our new capabilities.

Expanding Speech-to-Text features with V2 API

Since the official launch of the Speech-to-Text API back in 2017, we have used Google’s global infrastructure to host and monitor our production-facing transcription models. This robust, well-connected network has been the backbone of our offering for all of our customers. However, a unified view of our Speech-to-Text service has been a crucial request for our enterprise customers who need to satisfy data residency and compliance requirements, especially in regulated industries like banking and the public sector. We listened carefully to this feedback, and starting today, the Speech-to-Text v2 API supports full regionalization, allowing our customers to invoke identical copies of all our transcription models in the Google Cloud Platform region of their choice.

In addition to giving users the flexibility to deploy in any region, we are adding a number of new features to help developers build on the API:

Recognizers: A user-defined named configuration that combines a model identifier, the language-locale of the audio to be transcribed, and the cloud region for the transcription model to run. Once created, the recognizer can be referenced in every subsequent transcription request, eliminating the need for users to repeatedly define the same configuration parameters. This resourceful implementation of recognizers allows for greater flexibility in authentication and authorization, as users are no longer required to set up dedicated service accounts.

Cloud Logging: Requests performed using a recognizer object support Cloud Logging by default. Since the recognizers are defined as named entities, customers can partition traffic based on the recognizer of interest or view it collectively.

Audio Format Auto-Detection: Instead of having our users analyze and manually define the audio configuration settings to pass in a transcription request, the new Speech-to-Text v2 API detects settings like encoding, sampling rate, and channel count, then automatically populates the request configuration parameters.

Increasing accuracy at Enterprise Scale with Chirp

As part of our continuous investment in foundational speech models, in March 2023 Google released research results for our Universal Speech Model (USM), a family of state-of-the-art speech models with 2B parameters and support for transcription of 300+ languages. In May 2023, at Google I/O, Google announced Chirp in Private Preview, the latest version of the USM family, fine-tuned for our Cloud-specific use cases.

Chirp is now GA through the Speech-to-Text v2 API. Following extensive testing and feedback from our customers, we are making the power of pre-trained large models accessible through a simple enterprise-grade API surface. Our early adopters have seen major strides in customer engagement, thanks to the market-leading accuracy and language coverage of the new model, and we cannot wait to see what opportunities our enterprise customers will unlock.

Introducing new pricing, tiers, and options

We heard from customers that price can be just as important as quality for many workloads. That’s why the Speech-to-Text API v2 features totally new pricing.
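
To make the recognizer, regionalization, and audio auto-detection features described in this post more concrete, here is a minimal Python sketch that only builds the request pieces and never contacts the service. It is an illustration under stated assumptions, not an official sample: the project ID, region, and recognizer ID are hypothetical, and the resource-name pattern, regional hostname, and JSON field names (`defaultRecognitionConfig`, `autoDecodingConfig`, `languageCodes`) reflect our reading of the v2 REST surface and should be checked against the current API reference before use.

```python
import base64
import json


def regional_endpoint(location: str) -> str:
    """Return the API hostname for a location (assumed naming convention)."""
    if location == "global":
        return "speech.googleapis.com"
    return f"{location}-speech.googleapis.com"


def recognizer_name(project: str, location: str, recognizer_id: str) -> str:
    """Build the full resource name of a recognizer (assumed pattern)."""
    return f"projects/{project}/locations/{location}/recognizers/{recognizer_id}"


def create_recognizer_body(model: str, language_codes: list[str]) -> dict:
    """Body for creating a recognizer: model and language are set once here.

    The empty autoDecodingConfig asks the service to detect encoding,
    sample rate, and channel count (field names are assumptions).
    """
    return {
        "defaultRecognitionConfig": {
            "model": model,
            "languageCodes": list(language_codes),
            "autoDecodingConfig": {},
        }
    }


def recognize_request(project: str, location: str, recognizer_id: str,
                      audio: bytes) -> tuple[str, dict]:
    """Build the URL and body for a transcription that reuses a recognizer.

    No per-request config is included; the recognizer's defaults apply.
    """
    name = recognizer_name(project, location, recognizer_id)
    url = f"https://{regional_endpoint(location)}/v2/{name}:recognize"
    body = {"content": base64.b64encode(audio).decode("ascii")}
    return url, body


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(json.dumps(create_recognizer_body("chirp", ["en-US"]), indent=2))
    url, body = recognize_request("my-project", "us-central1",
                                  "support-calls-en", b"raw audio bytes")
    print(url)
```

The point of the sketch is the division of labor: the model, language, and decoding settings live in the recognizer, the region is chosen through the endpoint and the resource name, and each transcription request then carries little more than the audio itself.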