
c# - Cognitive AI Search in Blob Storage using Azure Open AI - Stack Overflow

I am working on a project where my data (approximately 300K+ .doc files) is in Blob Storage, and all files are in a standard format. I am using Cognitive AI Search with Azure OpenAI to find the right files. The code below works, but it never returns more than 5 files, even though thousands of files are available.


Below is my code.

var payload = new
{
    dataSources = new[]
    {
        new
        {
            type = "AzureCognitiveSearch",
            parameters = new
            {
                endpoint = azureSearchEndpoint,
                key = azureSearchKey,
                indexName = azureSearchIndex,
                top = 20
            }
        }
    },
    messages = new[]
    {
        new
        {
            role = "user",
            content = userMessage
        }
    },
    max_tokens = 2980
};

// Create an HttpClient instance
using (HttpClient client = new HttpClient())
{
    // Set the request headers
    client.DefaultRequestHeaders.Add("api-key", oaiKey);

    // Serialize the payload
    string serializedPayload = JsonConvert.SerializeObject(payload);

    // Create the request content
    StringContent cont = new StringContent(serializedPayload, System.Text.Encoding.UTF8, "application/json");
    await Task.Delay(10000);

    // Make the POST request
    HttpResponseMessage response = await client.PostAsync($"{oaiEndpoint}/openai/deployments/{oaiDeploymentName}/extensions/chat/completions?api-version=2023-06-01-preview", cont);

    // Read the response content
    string responseContent = await response.Content.ReadAsStringAsync();
    JObject parsedJson = JObject.Parse(responseContent);

    // Take the reply text from the second message of the first choice
    resContent = (string)parsedJson["choices"]?[0]?["messages"]?[1]?["content"];
}

I tried adding batching, and I also sent multiple requests in a loop to get more results and combine them. I also tried changing parameters such as max_tokens and top, but no luck!

Asked Jan 19 at 17:13 by vv_Coder; edited Jan 20 at 11:08.
  • Can you share the full code with filter? – Venkatesan Commented Jan 20 at 4:33
  • What are you expecting? Please add details about the results you expect. – JayashankarGS Commented Jan 20 at 5:21
  • You will get only matching results. – JayashankarGS Commented Jan 20 at 5:23
  • Check the files in the context: parsedJson["choices"]?[0]?["messages"]["context"] – JayashankarGS Commented Jan 20 at 6:36
  • Hello @Venkatesan, you can ignore the filter. It does not give more than 5 results even without any filter – vv_Coder Commented Jan 20 at 11:11

2 Answers

In the chat completion response, the references/citations you see are not necessarily all the documents pulled back from AI Search. Currently, the citations list returns only 1-5 references, and you can't configure its size.

https://learn.microsoft.com/en-us/answers/questions/1368600/increase-number-of-citations-for-azure-openai-serv

I think your question is more about how to make sure the AI Search results cover the question well. If so, you can run the same prompt in the Azure portal to see how many records match, or turn on tracing in your code.
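As a rough sketch of that check done in code rather than in the portal, the query can be sent straight to the search index with `count` enabled, so the total number of matches is visible independently of the citations list. This assumes the Azure AI Search "Search Documents" REST call with `api-version=2023-11-01`; the class and method names are hypothetical, and `System.Text.Json` is used only to avoid an extra package. The query Azure OpenAI actually sends to the index may differ from the raw user message, so treat the count as an approximation.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of any SDK.
public static class SearchMatchCounter
{
    // Builds the JSON body for a search request that also asks for the total match count.
    public static string BuildSearchBody(string searchText, int top)
    {
        return JsonSerializer.Serialize(new { search = searchText, count = true, top });
    }

    // Reads the total match count ("@odata.count") from the search response; returns -1 if absent.
    public static long ParseMatchCount(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        return doc.RootElement.TryGetProperty("@odata.count", out JsonElement count)
            ? count.GetInt64()
            : -1;
    }

    // Sends the query directly to the search index, bypassing Azure OpenAI.
    public static async Task<long> CountMatchesAsync(
        HttpClient client, string searchEndpoint, string indexName, string searchKey, string searchText)
    {
        // Assumed API version; adjust to whatever your search service supports.
        string url = $"{searchEndpoint}/indexes/{indexName}/docs/search?api-version=2023-11-01";
        using var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            Content = new StringContent(BuildSearchBody(searchText, 20), Encoding.UTF8, "application/json")
        };
        request.Headers.Add("api-key", searchKey);

        using HttpResponseMessage response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseMatchCount(await response.Content.ReadAsStringAsync());
    }
}
```

Calling `CountMatchesAsync` with the same endpoint, index name, and key used in the question's payload should show whether the index really matches thousands of documents for the prompt, or whether the search step itself is returning only a few.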

Use the latest API version, which includes the following changes:

  • The API path is changed from /extensions/chat/completions to /chat/completions.
  • The naming convention of property keys and enum values is changed from camel casing to snake casing. Example: deploymentName is changed to deployment_name.
  • The data source type AzureCognitiveSearch is changed to azure_search.

You can use the code below.

The default limit is 5 documents; you can change it with the top_n_documents parameter.

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Program
{
    public static async Task Main(string[] args)
    {
        string azureSearchEndpoint = "https://<endpoint_name>.search.windows.net";
        string azureSearchKey = "key";
        string azureSearchIndex = "azureblob-index";
        string oaiEndpoint = "https://<openai_name>.openai.azure.com/openai/deployments/<deployment_name>/chat/completions?api-version=2024-02-01";
        string oaiKey = "key";
        string userMessage = "Take details of only csv files"; // Update as per your input

        // Request body in the newer snake_case format
        var payload = new
        {
            data_sources = new[]
            {
                new
                {
                    type = "azure_search",
                    parameters = new
                    {
                        endpoint = azureSearchEndpoint,
                        authentication = new
                        {
                            type = "api_key",
                            key = azureSearchKey
                        },
                        index_name = azureSearchIndex,
                        top_n_documents = 20 // Default is 5
                    }
                }
            },
            messages = new[]
            {
                new
                {
                    role = "user",
                    content = userMessage
                }
            },
            max_tokens = 2980
        };

        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("api-key", oaiKey);

            string serializedPayload = JsonConvert.SerializeObject(payload);
            StringContent content = new StringContent(serializedPayload, Encoding.UTF8, "application/json");

            HttpResponseMessage response = await client.PostAsync(oaiEndpoint, content);

            if (response.IsSuccessStatusCode)
            {
                string responseContent = await response.Content.ReadAsStringAsync();
                JObject parsedJson = JObject.Parse(responseContent);

                // Citations are read from choices[0].message.context.citations
                JArray citations = (JArray)parsedJson["choices"]?[0]?["message"]?["context"]?["citations"];
                int citationLength = citations?.Count ?? 0;

                Console.WriteLine($"Number of citations: {citationLength}");
            }
            else
            {
                // Print the status code and error body, then stop
                Console.WriteLine($"Error: {response.StatusCode} - {await response.Content.ReadAsStringAsync()}");
            }
        }
    }
}

Output:

Here, I set top_n_documents to 6, and the response cited those 6 documents; you can do the same in your case.

Next, I set top_n_documents to 10 and got 10 citations.

Note: Make sure your model's token limit can accommodate the extra documents before you increase top_n_documents.
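To make the note above concrete, here is a back-of-the-envelope sketch of the arithmetic. Everything in it is an assumption for illustration: the helper name, the context window size, the average tokens per retrieved chunk, and the prompt overhead are hypothetical values, not figures from the service. The service may also trim retrieved content on its own, so this is only a rough estimate.

```csharp
using System;

// Hypothetical helper for a rough token-budget estimate.
public static class TokenBudget
{
    // Returns true when the retrieved chunks, the prompt overhead, and the
    // reserved completion tokens (max_tokens) all fit in the context window.
    public static bool FitsContext(
        int contextWindow,   // model context size in tokens (assumed value)
        int maxTokens,       // max_tokens reserved for the completion
        int topNDocuments,   // top_n_documents sent in the request
        int tokensPerChunk,  // assumed average tokens per retrieved chunk
        int promptOverhead)  // assumed tokens for system prompt and user message
    {
        long needed = (long)topNDocuments * tokensPerChunk + maxTokens + promptOverhead;
        return needed <= contextWindow;
    }
}
```

For example, with an assumed 8192-token window, max_tokens = 2980, and chunks of about 1024 tokens, `FitsContext(8192, 2980, 20, 1024, 200)` returns false, because 20 chunks alone need 20480 tokens. In that situation a smaller max_tokens, smaller chunks, or a model with a larger context window would be needed before raising top_n_documents.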
