Summary
The huggingface_hub integration instruments four InferenceClient / AsyncInferenceClient task methods — chat_completion, text_generation, feature_extraction, and sentence_similarity — but does not instrument automatic_speech_recognition(). This method generates a text transcript from an audio input and is a stable, production generative execution surface in the HuggingFace Hub SDK.
The OpenAI integration already instruments the equivalent surface (audio.transcriptions.create() via TranscriptionsPatcher), making this an asymmetry across providers.
What is missing
| Method | Task | Instrumented? |
| --- | --- | --- |
| InferenceClient.chat_completion() | Chat generation | Yes |
| InferenceClient.text_generation() | Text generation | Yes |
| InferenceClient.feature_extraction() | Embeddings | Yes |
| InferenceClient.sentence_similarity() | Semantic similarity | Yes |
| InferenceClient.automatic_speech_recognition() | Audio → text transcription | No |
| AsyncInferenceClient.automatic_speech_recognition() | Audio → text transcription (async) | No |
automatic_speech_recognition() is distinct from the three tasks tracked in issue #487 (text_to_image, image_to_text, text_to_speech): it takes audio as input and produces a text transcript, which is the reverse modality direction from text_to_speech and involves a different class of execution.
What should be instrumented
Calls to InferenceClient.automatic_speech_recognition(audio, ...) should produce a span capturing:
| Span field | Content |
| --- | --- |
| input | Audio file path, bytes, or URL |
| output | text field from SpeechToTextOutput |
| metadata | provider: "huggingface_hub", model (from model param), task-specific params |
| metrics | Latency (time_to_first_token), response timing |
Both sync and async variants need patchers, consistent with the existing pattern in py/src/braintrust/integrations/huggingface_hub/patchers.py.
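The span contents listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the integration's actual patcher code: `SPANS`, `FakeTranscript`, `FakeClient`, `FakeAsyncClient`, `describe_audio`, `record_span`, `wrap_asr`, and `wrap_asr_async` are hypothetical names, the in-memory list stands in for Braintrust's span API, and the fake clients stand in for InferenceClient / AsyncInferenceClient so the example runs without network access. It also assumes the model is passed per call; a model configured on the client itself would need separate handling.

```python
import asyncio
import functools
import time
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical in-memory sink; a real patcher would log to a Braintrust span.
SPANS: List[Dict[str, Any]] = []


@dataclass
class FakeTranscript:
    """Stand-in for the result object, which exposes a `text` field."""
    text: str


def describe_audio(audio: Any) -> Any:
    """Keep paths and URLs as strings; summarize raw bytes instead of storing them."""
    if isinstance(audio, (bytes, bytearray)):
        return {"type": "bytes", "size": len(audio)}
    return str(audio)


def record_span(audio: Any, kwargs: Dict[str, Any], result: Any, start: float) -> None:
    """Append one span-like record with input, output, metadata, and metrics."""
    params = {k: v for k, v in kwargs.items() if k != "model"}
    SPANS.append({
        "name": "automatic_speech_recognition",
        "input": describe_audio(audio),
        # Fall back to the raw result if it has no `text` attribute.
        "output": getattr(result, "text", result),
        "metadata": {"provider": "huggingface_hub",
                     "model": kwargs.get("model"),
                     "params": params},
        "metrics": {"duration": time.perf_counter() - start},
    })


def wrap_asr(fn):
    """Wrap a sync automatic_speech_recognition callable."""
    @functools.wraps(fn)
    def wrapper(audio, **kwargs):
        start = time.perf_counter()
        result = fn(audio, **kwargs)
        record_span(audio, kwargs, result, start)
        return result
    return wrapper


def wrap_asr_async(fn):
    """Wrap an async automatic_speech_recognition callable."""
    @functools.wraps(fn)
    async def wrapper(audio, **kwargs):
        start = time.perf_counter()
        result = await fn(audio, **kwargs)
        record_span(audio, kwargs, result, start)
        return result
    return wrapper


class FakeClient:
    def automatic_speech_recognition(self, audio, *, model=None):
        return FakeTranscript(text="hello world")


class FakeAsyncClient:
    async def automatic_speech_recognition(self, audio, *, model=None):
        return FakeTranscript(text="hello async")


# Usage with the stand-in clients (the model name is a placeholder).
sync_asr = wrap_asr(FakeClient().automatic_speech_recognition)
sync_asr(b"\x00\x01\x02", model="example/asr-model")

async_asr = wrap_asr_async(FakeAsyncClient().automatic_speech_recognition)
asyncio.run(async_asr("sample.flac"))
```

Raw audio bytes are summarized by size here rather than logged verbatim. That is a design choice of this sketch; the real integration might attach the audio instead.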
Braintrust docs status
not_found — The Braintrust HuggingFace Hub integration docs describe tracing for chat_completion, text_generation, and embeddings. No mention of audio transcription or ASR.
Upstream sources
- huggingface_hub.InferenceClient.automatic_speech_recognition: sync method, available in huggingface-hub ≥ 0.15; current matrix version 1.17.0
- huggingface_hub.inference._generated._async_client.AsyncInferenceClient.automatic_speech_recognition: async variant
- Returns SpeechToTextOutput with text: str
- HuggingFace Hub inference API docs: https://huggingface.co/docs/huggingface_hub/package_reference/inference_client
Relationship to existing issues
- Issue #487 covers text_to_image, image_to_text, and text_to_speech, a different set of generative media tasks on the same client. This issue covers the audio-to-text transcription surface, which is absent from #487's scope.
- The OpenAI integration instruments audio.transcriptions.create() via TranscriptionsPatcher as the direct analog.
Local files inspected
- py/src/braintrust/integrations/huggingface_hub/patchers.py: defines 8 patchers (4 sync + 4 async) for chat_completion, text_generation, feature_extraction, sentence_similarity; no reference to automatic_speech_recognition
- py/src/braintrust/integrations/huggingface_hub/tracing.py: wrapper implementations for the 4 covered tasks; no ASR wrapper
- py/src/braintrust/integrations/huggingface_hub/integration.py: registers patchers; no ASR patcher
- py/src/braintrust/integrations/huggingface_hub/test_huggingface_hub.py: no ASR test cases
- py/src/braintrust/integrations/openai/patchers.py: TranscriptionsPatcher and AsyncTranscriptionsPatcher instrument audio.transcriptions.create() as the OpenAI analog
- py/pyproject.toml: matrix version huggingface-hub==1.17.0