[bot] HuggingFace Hub: InferenceClient.automatic_speech_recognition() not instrumented #506

@braintrust-bot

Description

Summary

The huggingface_hub integration instruments four InferenceClient / AsyncInferenceClient task methods — chat_completion, text_generation, feature_extraction, and sentence_similarity — but does not instrument automatic_speech_recognition(). This method generates a text transcript from an audio input and is a stable, production generative execution surface in the HuggingFace Hub SDK.

The OpenAI integration already instruments the equivalent surface (audio.transcriptions.create() via TranscriptionsPatcher), making this an asymmetry across providers.

What is missing

| Method | Task | Instrumented? |
| --- | --- | --- |
| InferenceClient.chat_completion() | Chat generation | Yes |
| InferenceClient.text_generation() | Text generation | Yes |
| InferenceClient.feature_extraction() | Embeddings | Yes |
| InferenceClient.sentence_similarity() | Semantic similarity | Yes |
| InferenceClient.automatic_speech_recognition() | Audio → text transcription | No |
| AsyncInferenceClient.automatic_speech_recognition() | Audio → text transcription (async) | No |

automatic_speech_recognition() is distinct from the three tasks tracked in issue #487 (text_to_image, image_to_text, text_to_speech): it takes audio as input and produces a text transcript, which is the reverse modality direction from text_to_speech and involves a different class of execution.

What should be instrumented

Calls to InferenceClient.automatic_speech_recognition(audio, ...) should produce a span capturing:

| Span field | Content |
| --- | --- |
| input | Audio file path, bytes, or URL |
| output | text field from SpeechToTextOutput |
| metadata | provider: "huggingface_hub", model (from model param), task-specific params |
| metrics | Latency (time_to_first_token), response timing |

Both sync and async variants need patchers, consistent with the existing pattern in py/src/braintrust/integrations/huggingface_hub/patchers.py.
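
As a rough illustration of the span shape described above, here is a minimal sketch of a sync wrapper. It is not the code in patchers.py or tracing.py: `wrap_asr`, `describe_audio`, `record_span`, `FakeInferenceClient`, `FakeASROutput`, and the model name are all hypothetical stand-ins so the example runs without huggingface_hub, Braintrust, or network access. Summarizing raw bytes instead of logging them is also an assumption, not something this issue specifies.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class FakeASROutput:
    """Hypothetical stand-in for the SDK output object, which exposes `text`."""
    text: str


class FakeInferenceClient:
    """Hypothetical stub client so the sketch runs offline."""

    def automatic_speech_recognition(self, audio, *, model=None, **kwargs):
        return FakeASROutput(text="hello world")


def describe_audio(audio: Any) -> Any:
    """Summarize raw bytes (assumption); log paths and URLs as strings."""
    if isinstance(audio, (bytes, bytearray)):
        return {"type": "bytes", "size": len(audio)}
    return str(audio)


def wrap_asr(original: Callable, record_span: Callable[[Dict[str, Any]], None]):
    """Return a wrapper that calls `original` and records one span per call."""

    def wrapper(self, audio, *, model=None, **kwargs):
        start = time.perf_counter()
        result = original(self, audio, model=model, **kwargs)
        duration = time.perf_counter() - start
        record_span({
            "input": describe_audio(audio),
            "output": getattr(result, "text", None),
            "metadata": {"provider": "huggingface_hub", "model": model, **kwargs},
            "metrics": {"duration_seconds": duration},
        })
        return result

    return wrapper


# Patch the stub client; `spans.append` plays the role of a span logger.
spans: List[Dict[str, Any]] = []
FakeInferenceClient.automatic_speech_recognition = wrap_asr(
    FakeInferenceClient.automatic_speech_recognition, spans.append
)
result = FakeInferenceClient().automatic_speech_recognition(
    b"\x00\x01\x02", model="hypothetical/asr-model"
)
```

The wrapper returns the original result unchanged, so callers would see no behavioral difference; only the recorded span is new.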

Braintrust docs status

not_found — The Braintrust HuggingFace Hub integration docs describe tracing for chat_completion, text_generation, and embeddings. No mention of audio transcription or ASR.

Upstream sources

  • huggingface_hub.InferenceClient.automatic_speech_recognition — sync method, available since huggingface-hub ≥ 0.15; current matrix version 1.17.0
  • huggingface_hub.inference._generated._async_client.AsyncInferenceClient.automatic_speech_recognition — async variant
  • Returns SpeechToTextOutput with text: str
  • HuggingFace Hub inference API docs: https://huggingface.co/docs/huggingface_hub/package_reference/inference_client
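
For the async variant listed above, a wrapper could await the original coroutine and record the span in a `finally` block so that failed calls are still captured. This is a sketch under stated assumptions: `wrap_async_asr`, `StubAsyncClient`, `record_span`, the `error` field, and the model name are hypothetical, and the stub replaces AsyncInferenceClient so the example runs offline.

```python
import asyncio
import time
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


class StubAsyncClient:
    """Hypothetical stub standing in for AsyncInferenceClient."""

    async def automatic_speech_recognition(self, audio, *, model=None, **kwargs):
        await asyncio.sleep(0)  # simulate an awaited network call
        return SimpleNamespace(text="transcribed audio")


def wrap_async_asr(original: Callable, record_span: Callable[[Dict[str, Any]], None]):
    """Return an async wrapper that records one span per call, even on error."""

    async def wrapper(self, audio, *, model=None, **kwargs):
        span: Dict[str, Any] = {
            "input": str(audio),
            "metadata": {"provider": "huggingface_hub", "model": model, **kwargs},
        }
        start = time.perf_counter()
        try:
            result = await original(self, audio, model=model, **kwargs)
            span["output"] = getattr(result, "text", None)
            return result
        except Exception as exc:
            span["error"] = repr(exc)  # hypothetical field name
            raise
        finally:
            span["metrics"] = {"duration_seconds": time.perf_counter() - start}
            record_span(span)

    return wrapper


async_spans: List[Dict[str, Any]] = []
StubAsyncClient.automatic_speech_recognition = wrap_async_asr(
    StubAsyncClient.automatic_speech_recognition, async_spans.append
)
async_result = asyncio.run(
    StubAsyncClient().automatic_speech_recognition(
        "sample.flac", model="hypothetical/asr-model"
    )
)
```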

Relationship to existing issues

Local files inspected

  • py/src/braintrust/integrations/huggingface_hub/patchers.py — defines 8 patchers (4 sync + 4 async) for chat_completion, text_generation, feature_extraction, sentence_similarity; no reference to automatic_speech_recognition
  • py/src/braintrust/integrations/huggingface_hub/tracing.py — wrapper implementations for the 4 covered tasks; no ASR wrapper
  • py/src/braintrust/integrations/huggingface_hub/integration.py — registers patchers; no ASR patcher
  • py/src/braintrust/integrations/huggingface_hub/test_huggingface_hub.py — no ASR test cases
  • py/src/braintrust/integrations/openai/patchers.py: TranscriptionsPatcher and AsyncTranscriptionsPatcher instrument audio.transcriptions.create() as the OpenAI analog
  • py/pyproject.toml — matrix version huggingface-hub==1.17.0
