I am working on a project where my data (approx. 300K+ .doc files) is in Blob storage and all files are in a standard format. I am using Azure AI Search (Cognitive Search) with Azure OpenAI to find the right files. The code below works, but it never returns more than 5 files, even though thousands of files are available.
Below is my code.
var payload = new
{
    dataSources = new[]
    {
        new
        {
            type = "AzureCognitiveSearch",
            parameters = new
            {
                endpoint = azureSearchEndpoint,
                key = azureSearchKey,
                indexName = azureSearchIndex,
                top = 20
            }
        }
    },
    messages = new[]
    {
        new
        {
            role = "user",
            content = userMessage
        }
    },
    max_tokens = 2980
};
// Create an HttpClient instance
using (HttpClient client = new HttpClient())
{
    // Set the request headers
    client.DefaultRequestHeaders.Add("api-key", oaiKey);
    // Serialize the payload
    string serializedPayload = JsonConvert.SerializeObject(payload);
    // Create the request content
    StringContent cont = new StringContent(serializedPayload, System.Text.Encoding.UTF8, "application/json");
    await Task.Delay(10000);
    // Make the POST request
    HttpResponseMessage response = await client.PostAsync($"{oaiEndpoint}/openai/deployments/{oaiDeploymentName}/extensions/chat/completions?api-version=2023-06-01-preview", cont);
    // Read the response content
    string responseContent = await response.Content.ReadAsStringAsync();
    JObject parsedJson = JObject.Parse(responseContent);
    // The assistant's answer is the second entry of the "messages" array in this API version
    resContent = (string)parsedJson["choices"]?[0]?["messages"]?[1]?["content"];
}
I tried adding batching, and I also sent multiple requests in a loop and combined the results. I changed parameters such as max_tokens and top as well, but no luck!
Asked Jan 19 at 17:13 by vv_Coder (edited Jan 20 at 11:08)
2 Answers
In the chat completion response, the references/citations you see are not necessarily all of the documents pulled back from AI Search. Currently, the citations list only returns 1-5 references, and you can't configure its size.
https://learn.microsoft.com/en-us/answers/questions/1368600/increase-number-of-citations-for-azure-openai-serv
I think your question is more about how to make sure the AI Search results cover the question well. If so, you can run the same prompt in the Azure portal to see how many records match, or turn on tracing in your code.
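One way to check how many records match, outside the chat completion call, is to query the search index directly and ask for the total count. The following is only a minimal sketch: it assumes the Search Documents REST endpoint with api-version 2023-11-01, uses System.Text.Json instead of Newtonsoft.Json, and the class and method names (SearchProbe, BuildBody, ReadCount, CountMatchesAsync) are made up for illustration.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Text.Json.Nodes;
using System.Threading.Tasks;

// Hypothetical helper, not part of any SDK.
public static class SearchProbe
{
    // Builds the request body: the search text, a flag asking for the
    // total match count, and the number of documents to return.
    public static string BuildBody(string query, int top) =>
        JsonSerializer.Serialize(new { search = query, count = true, top });

    // Reads the total number of matches from the "@odata.count" field.
    // Returns null when the field is absent.
    public static long? ReadCount(string responseJson) =>
        JsonNode.Parse(responseJson)?["@odata.count"]?.GetValue<long>();

    // Sends the query to the index and returns the reported match count.
    // The api-version value is an assumption; use the one your service supports.
    public static async Task<long?> CountMatchesAsync(
        string endpoint, string index, string key, string query)
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("api-key", key);
        string url = $"{endpoint}/indexes/{index}/docs/search?api-version=2023-11-01";
        using var content = new StringContent(BuildBody(query, 20), Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync(url, content);
        response.EnsureSuccessStatusCode();
        return ReadCount(await response.Content.ReadAsStringAsync());
    }
}
```

If the count is in the thousands while the chat answer only cites a handful of files, the limit is on the chat completion side rather than in the index.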
Use the latest API version and take into account the following changes:
- The API path is changed from /extensions/chat/completions to /chat/completions.
- The naming convention of property keys and enum values is changed from camel casing to snake casing. Example: deploymentName is changed to deployment_name.
- The data source type AzureCognitiveSearch is changed to azure_search.
You can use the code below.
The default limit is 5 documents, and you can change it with the top_n_documents parameter.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Program
{
    public static async Task Main(string[] args)
    {
        string azureSearchEndpoint = "https://<endpoint_name>.search.windows.net";
        string azureSearchKey = "key";
        string azureSearchIndex = "azureblob-index";
        string oaiEndpoint = "https://<openai_name>.openai.azure.com/openai/deployments/<deployment_name>/chat/completions?api-version=2024-02-01";
        string oaiKey = "key";
        string userMessage = "Take details of only csv files"; // Update as per your input

        var payload = new
        {
            data_sources = new[]
            {
                new
                {
                    type = "azure_search",
                    parameters = new
                    {
                        endpoint = azureSearchEndpoint,
                        authentication = new
                        {
                            type = "api_key",
                            key = azureSearchKey,
                        },
                        index_name = azureSearchIndex,
                        // Number of retrieved documents passed to the model (default is 5)
                        top_n_documents = 20
                    }
                }
            },
            messages = new[]
            {
                new
                {
                    role = "user",
                    content = userMessage
                }
            },
            max_tokens = 2980
        };

        using (HttpClient client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("api-key", oaiKey);
            string serializedPayload = JsonConvert.SerializeObject(payload);
            StringContent content = new StringContent(serializedPayload, Encoding.UTF8, "application/json");
            HttpResponseMessage response = await client.PostAsync(
                oaiEndpoint,
                content);
            if (response.IsSuccessStatusCode)
            {
                string responseContent = await response.Content.ReadAsStringAsync();
                JObject parsedJson = JObject.Parse(responseContent);
                // Citations are under choices[0].message.context in this API version
                JArray citations = (JArray)parsedJson["choices"]?[0]?["message"]?["context"]?["citations"];
                int citationLength = citations?.Count ?? 0;
                Console.WriteLine($"Number of citations: {citationLength}");
            }
            else
            {
                // Print the status code and the error body returned by the service
                Console.WriteLine($"Error: {response.StatusCode} - {await response.Content.ReadAsStringAsync()}");
            }
        }
    }
}
Output:
Here, I set top_n_documents to 6 and the response used those 6 documents as citations; you can do the same in your case.
Next, I set top_n_documents to 10 and got 10 citations.
Note: Make sure your model can accept the additional tokens before you increase top_n_documents, since every extra document adds to the prompt size.
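To see which files were actually cited rather than just how many, the citations array can be walked item by item. This is a hedged sketch, not part of the answer above: it assumes each citation object carries optional title and filepath fields, uses System.Text.Json instead of Newtonsoft.Json, and CitationReader and ReadCitationTitles are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical helper, not part of any SDK.
public static class CitationReader
{
    // Returns one label per citation found under
    // choices[0].message.context.citations, preferring "title" and
    // falling back to "filepath" (both field names are assumptions).
    public static List<string> ReadCitationTitles(string responseJson)
    {
        var labels = new List<string>();
        var root = JsonNode.Parse(responseJson);
        var citations = root?["choices"]?[0]?["message"]?["context"]?["citations"]?.AsArray();
        if (citations is null)
        {
            return labels; // No citations in this response
        }

        foreach (var citation in citations)
        {
            var title = citation?["title"]?.GetValue<string>();
            var filepath = citation?["filepath"]?.GetValue<string>();
            labels.Add(title ?? filepath ?? "(untitled)");
        }
        return labels;
    }
}
```

Calling ReadCitationTitles(responseContent) after a successful request gives a list you can print or compare across different top_n_documents values.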
parsedJson["choices"]?[0]?["messages"]["context"]
– JayashankarGS Commented Jan 20 at 6:36