Enable File Search: Implement /v1/files & Vector Store APIs
Hello Maintainers,
We extend our sincere gratitude for your dedicated efforts on the new API project. Your work is invaluable to developers like us who depend on this gateway to support a multitude of LLM providers. The addition of the /v1/responses relay is particularly appreciated. However, we've identified a crucial gap that hinders our ability to fully leverage one of the most powerful tools within the Responses API: file search.
The Importance of File Search in the Responses API
File search is a cornerstone feature of the Responses API. According to OpenAI’s official documentation, developers upload documents via the Files endpoint and then pass the resulting file_id into the Responses call, typically by attaching the file to a vector store that the file_search tool queries. This workflow is what enables a model to retrieve context from user-provided documents, and it is essential for applications that must extract information from specific files. File search also supports metadata filtering, which lets users narrow retrieval to particular documents or attributes; that precision matters for knowledge-intensive tasks that require a deep understanding of context.
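As a concrete illustration, the documented two-step workflow amounts to relaying requests like the following. This is a sketch under assumptions: the paths and payload shapes mirror OpenAI’s public API, while GATEWAY_BASE is a hypothetical deployment URL, not a real host.

```python
# Sketch of the documented file-search workflow, expressed as the plain HTTP
# requests a gateway would need to relay. Paths and payload shapes follow
# OpenAI's public API; GATEWAY_BASE is a hypothetical deployment URL.

GATEWAY_BASE = "https://new-api.example.com"  # placeholder, not a real host

def upload_file_request(filename: str) -> dict:
    # Step 1: POST /v1/files -- multipart upload with purpose=assistants.
    return {
        "method": "POST",
        "url": f"{GATEWAY_BASE}/v1/files",
        "files": {"file": filename},
        "data": {"purpose": "assistants"},
    }

def file_search_request(model: str, question: str, vector_store_id: str) -> dict:
    # Step 2: POST /v1/responses -- the file_search tool points at the vector
    # store that holds the uploaded file.
    return {
        "method": "POST",
        "url": f"{GATEWAY_BASE}/v1/responses",
        "json": {
            "model": model,
            "input": question,
            "tools": [{
                "type": "file_search",
                "vector_store_ids": [vector_store_id],
            }],
        },
    }
```

Today both requests have to target api.openai.com directly; with a relay in place, the same calls would go through the gateway unchanged.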
Currently, new-api lacks a relay for /v1/files and the related vector store management endpoints. Users must therefore bypass the gateway and call OpenAI’s API directly for file uploads and vector store creation. This workaround breaks the unified gateway experience that new-api aims to provide, complicates billing and logging, and leaves a gap in feature parity just as more developers transition to the Responses API.
The Benefits of Implementing /v1/files and Vector Store Endpoints
Implementing /v1/files (create, list, retrieve, delete, get content) and vector store management (create, add file, delete, list) would unlock significant benefits for developers and the new API platform:
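For clarity, the requested surface maps onto the routes below. The paths mirror OpenAI’s public API; this is a summary table for discussion, not code taken from new-api.

```python
# The requested endpoint surface as a simple route table a relay could
# register. Paths mirror OpenAI's public API; this is a summary, not code
# taken from new-api.

FILE_ROUTES = [
    ("POST",   "/v1/files"),                    # create (multipart upload)
    ("GET",    "/v1/files"),                    # list
    ("GET",    "/v1/files/{file_id}"),          # retrieve metadata
    ("DELETE", "/v1/files/{file_id}"),          # delete
    ("GET",    "/v1/files/{file_id}/content"),  # get content
]

VECTOR_STORE_ROUTES = [
    ("POST",   "/v1/vector_stores"),                          # create
    ("GET",    "/v1/vector_stores"),                          # list
    ("DELETE", "/v1/vector_stores/{vector_store_id}"),        # delete
    ("POST",   "/v1/vector_stores/{vector_store_id}/files"),  # add file
]
```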
1. Full Utilization of the Responses API Toolset
Implementing /v1/files and vector store endpoints unlocks the full potential of the Responses API. File search is indispensable for agent-like workflows, retrieval-augmented generation (RAG), and knowledge-base applications; without native file endpoints, the Responses API is functionally incomplete inside the new-api ecosystem. Use cases range from customer service chatbots that consult product manuals to research assistants that analyze scientific papers, and none of them can run through the gateway today.
Consider a customer support chatbot: with file search, it can pull accurate, timely answers directly from product manuals and FAQs instead of relying solely on pre-programmed knowledge. Likewise, a research assistant can sift large volumes of scientific literature for key findings and trends. Native file handling is what makes these applications possible through the gateway.
2. Enhanced Developer Experience
A unified developer experience is paramount. Implementing these endpoints lets developers keep every call within new-api, streamlining authentication, logging, and usage tracking, and it aligns the gateway with OpenAI’s recommended workflows. The result is a cohesive, intuitive environment that encourages experimentation and innovation.
Imagine a developer building an application that needs both text generation and file processing. Without native file handling, they must maintain two separate integrations: new-api for generation and OpenAI’s API for files. Folding file handling into the gateway removes that overhead, letting developers focus on their application rather than juggling integrations, and it reduces the potential for errors and inconsistencies.
3. Future-Proofing the Platform
With OpenAI planning to phase out the Assistants API in favor of the Responses API by mid-2026, supporting the full spectrum of Responses features now is crucial. Doing so positions new-api as a competitive gateway as the ecosystem evolves, helping it attract a wider range of users and remain relevant in the long term.
This transition from the Assistants API to the Responses API is a significant shift in the AI development landscape. By fully supporting the Responses API, including file search alongside features such as function calling and streaming responses, new-api can help developers adapt to these changes and build applications that leverage the full potential of LLMs.
4. Unified Billing and Governance
Handling file uploads and vector stores through new-api enables centralized enforcement of quotas, tracking of storage usage, and consistent application of security policies across providers. This centralized control simplifies management and is essential for maintaining a secure, stable platform.
Without a unified system, administrators must track usage and enforce policies separately for each provider, a complex and error-prone task. Routing file uploads and vector stores through new-api lets these processes be streamlined and automated, and it gives better visibility into resource usage so costs can be optimized and resources allocated efficiently.
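As one hypothetical illustration of what centralized governance could look like, a gateway-side storage quota check might be as simple as the sketch below. The class, limits, and key handling are invented for the example, not proposed as new-api’s actual design.

```python
# Hypothetical sketch of gateway-side storage quota enforcement for file
# uploads. The class and limits are invented for illustration only.

class StorageQuota:
    """Track bytes stored per API key and reject uploads over the limit."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.used_bytes = {}  # api_key -> bytes currently stored

    def try_upload(self, api_key: str, size_bytes: int) -> bool:
        # Check the quota before relaying the upload upstream.
        current = self.used_bytes.get(api_key, 0)
        if current + size_bytes > self.limit_bytes:
            return False  # over quota: reject without calling the provider
        self.used_bytes[api_key] = current + size_bytes
        return True

    def release(self, api_key: str, size_bytes: int) -> None:
        # Give the space back when a file is deleted through the gateway.
        self.used_bytes[api_key] = max(
            0, self.used_bytes.get(api_key, 0) - size_bytes)
```

Because every upload and delete would flow through the gateway, this kind of accounting can stay accurate across providers.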
Addressing Implementation Considerations
We recognize that implementing multipart upload, vector store creation, and streaming file content introduces new challenges, such as managing size limits, authentication, and provider differences. However, even a minimal proxy implementation that forwards these requests to the underlying provider would be incredibly valuable for the community, since it would let us test file search through new-api without resorting to direct OpenAI calls.
This phased approach would allow the new API to gradually incorporate these advanced features while providing immediate value to developers. A minimal proxy implementation would serve as a foundation for future enhancements, allowing the platform to evolve and adapt to the changing needs of the community. This iterative approach is often the most effective way to build complex systems, as it allows for continuous feedback and improvement.
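To make the minimal-proxy idea concrete, here is one possible sketch of the forwarding step. The base URL, path allow-list, and key-swapping are assumptions about how such a relay might work, not a description of new-api internals.

```python
# Sketch of the minimal forwarding step: accept a request on the gateway,
# rewrite it for the upstream provider, and swap the credentials. The
# allow-list and header handling are assumptions for illustration.

UPSTREAM_BASE = "https://api.openai.com"

ALLOWED_PREFIXES = ("/v1/files", "/v1/vector_stores")

def rewrite_for_upstream(path: str, headers: dict, provider_key: str) -> tuple:
    """Return the (url, headers) to use when relaying a file/vector-store call."""
    if not path.startswith(ALLOWED_PREFIXES):
        raise ValueError(f"path not relayed by this proxy: {path}")
    out = dict(headers)
    # Replace the caller's gateway token with the real provider key.
    out["Authorization"] = f"Bearer {provider_key}"
    return UPSTREAM_BASE + path, out
```

Quotas, logging, and provider-specific translation could then be layered onto this forwarding core incrementally.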
Conclusion: A Call to Action
In conclusion, implementing /v1/files and vector store endpoints is a critical step toward unlocking the full potential of the Responses API within the new-api ecosystem. It would significantly improve the developer experience, future-proof the platform, and enable unified billing and governance. We urge you to consider this enhancement; it would have a profound impact on the usability and adoption of new-api as a next-generation LLM gateway.
Thank you for considering this request, and for your continued work on the project.
For more information on the Responses API, file search, and the Files and vector store endpoints, see the official OpenAI API documentation.