Key Phrase Extraction with Azure Cognitive Services in C#
Using Artificial Intelligence to quickly summarize documents
Key phrase extraction is an artificial intelligence technology that lets you summarize larger portions of text down to key phrases. Key phrase extraction can help readers understand search results or quickly understand a larger body of work. Azure Cognitive Services includes key phrase extraction as a capability of its language services. In this article we’ll explore how to interact with key phrase extraction using C# code and the Azure SDK or by using REST requests.
To get the most out of this article, you should be familiar with:
- The basics of Azure Cognitive Services
- The basics of C# programming
- How to add a package reference using NuGet
Adding a Reference to the Azure Cognitive Services Language SDK
To summarize text, we need to reference the Azure SDK. To get it, we’ll install the latest version of the Azure.AI.TextAnalytics package in Visual Studio using the NuGet package manager.
Caution: do not use Microsoft.Azure.CognitiveServices.Language.TextAnalytics. This is the old version of this library and it has some known bugs.
See Microsoft’s documentation on NuGet package manager for additional instructions on adding a package reference.
Note: Adding the package can also be done in the .NET CLI with dotnet add package Azure.AI.TextAnalytics.
Creating a TextAnalyticsClient Instance
Next, we’ll add some using statements to the top of our C# file:
using Azure;
using Azure.AI.TextAnalytics;
These let us use the classes in the Azure.AI.TextAnalytics namespace, along with shared types such as AzureKeyCredential from the Azure namespace.
After that, we’ll store our key and endpoint. These can be found on the Keys and Endpoint blade of your Cognitive Services resource in the Azure portal.
If you are using a single, multi-service Azure Cognitive Services resource, use one of that resource’s keys and its endpoint. If you instead created a stand-alone Language service to keep it isolated, use that service’s key and endpoint.
// These values should come from a config file and should NOT be stored in source control
string key = "YourKeyGoesHere";
Uri endpoint = new Uri("https://YourCogServicesUrl.cognitiveservices.azure.com/");
Important Security Note: In a real application you should not hard-code your Cognitive Services key in your source code or check it into source control. Instead, you should get this key from a non-versioned configuration file via IConfiguration or similar mechanisms. Checking in keys can lead to people discovering sensitive credentials via your current source or your source control history and potentially using them to perform their own analysis at your expense.
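One simple alternative to a configuration file is an environment variable. The sketch below is only an illustration: ReadRequiredSetting is a hypothetical helper, and the variable names mentioned afterwards are assumptions rather than anything Azure defines.

```csharp
using System;

// Hypothetical helper: reads a required setting from an environment variable
// and fails fast with a clear message when it is missing or blank.
static string ReadRequiredSetting(string name)
{
    string? value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Environment variable '{name}' is not set.");
    }
    return value;
}
```

With a helper like this, the key could be loaded with ReadRequiredSetting("LANGUAGE_KEY") and the endpoint with new Uri(ReadRequiredSetting("LANGUAGE_ENDPOINT")), assuming you chose those variable names when configuring your machine or host.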
Next, we’ll set up a TextAnalyticsClient. This object will handle all communications with Azure Cognitive Services later on.
// Create the TextAnalyticsClient and set its endpoint
AzureKeyCredential credentials = new AzureKeyCredential(key);
TextAnalyticsClient textClient = new TextAnalyticsClient(endpoint, credentials);
With the textClient created and configured, we’re ready to start summarizing text.
Key Phrase Extraction
Let’s start out with some text we want to summarize. Normally this would be text from the user or a file, but for this demo let’s get meta and use the opening paragraph of this article instead:
const string text = "Key phrase extraction is an artificial intelligence technology that " +
    "lets you summarize larger portions of text down to key phrases. " +
    "Key phrase extraction can help readers understand search results or " +
    "quickly understand a larger body of work. Azure Cognitive Services " +
    "includes key phrase extraction as a capability of its language services. " +
    "In this article we'll explore how to interact with key phrase extraction " +
    "using C# code and the Azure SDK or by using REST requests.";
Next, we’ll run the following C# code to extract key phrases:
// Detect Key Phrases
Response<KeyPhraseCollection> keyPhrasesResponse = textClient.ExtractKeyPhrases(text);
This will send a REST POST request to https://YourCogServicesUrl.cognitiveservices.azure.com/text/analytics/v3.1/keyPhrases with the following body:
{
  "documents":[
    {
      "id":"0",
      "text":"Key phrase extraction is an artificial intelligence technology that lets you summarize larger portions of text down to key phrases. Key phrase extraction can help readers understand search results or quickly understand a larger body of work. Azure Cognitive Services includes key phrase extraction as a capability of its language services. In this article we\u0027ll explore how to interact with key phrase extraction using C# code and the Azure SDK or by using REST requests.",
      "language":"en"
    }
  ]
}
Note that the language defaulted to “en” for English. If you wanted to, you could have specified a different language code as a parameter to ExtractKeyPhrases. Of course, you could also use the language detection features of Cognitive Services to determine the language with a prior call to the API.
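As a rough sketch of passing an explicit language, the snippet below sends a Spanish sentence with "es" as the second argument to ExtractKeyPhrases. The sample sentence and the LANGUAGE_KEY and LANGUAGE_ENDPOINT environment variable names are assumptions made for illustration only.

```csharp
using System;
using Azure;
using Azure.AI.TextAnalytics;

// Assumed environment variable names; substitute your own configuration source
string key = Environment.GetEnvironmentVariable("LANGUAGE_KEY")
    ?? throw new InvalidOperationException("LANGUAGE_KEY is not set.");
Uri endpoint = new Uri(Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT")
    ?? throw new InvalidOperationException("LANGUAGE_ENDPOINT is not set."));

TextAnalyticsClient textClient = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));

// Illustrative Spanish sample text (not taken from the article)
const string spanishText = "La extracción de frases clave resume documentos largos en pocas frases.";

// The second argument is the language code sent along with the document
Response<KeyPhraseCollection> response = textClient.ExtractKeyPhrases(spanishText, "es");

foreach (string phrase in response.Value)
{
    Console.WriteLine(phrase);
}
```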
Microsoft also gives us an ExtractKeyPhrasesBatch method that accepts multiple strings instead of a single one. Additionally, there are Async versions of both ExtractKeyPhrases and ExtractKeyPhrasesBatch.
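To give a feel for the batch and async variants together, here is a minimal sketch that sends two short documents with ExtractKeyPhrasesBatchAsync and checks each result for errors. The two sample documents and the environment variable names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using Azure;
using Azure.AI.TextAnalytics;

// Assumed environment variable names; substitute your own configuration source
string key = Environment.GetEnvironmentVariable("LANGUAGE_KEY")
    ?? throw new InvalidOperationException("LANGUAGE_KEY is not set.");
Uri endpoint = new Uri(Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT")
    ?? throw new InvalidOperationException("LANGUAGE_ENDPOINT is not set."));

TextAnalyticsClient textClient = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));

// Illustrative documents; each one gets its own result in the response
List<string> documents = new List<string>
{
    "Key phrase extraction summarizes text down to key phrases.",
    "Azure Cognitive Services includes language services."
};

Response<ExtractKeyPhrasesResultCollection> response =
    await textClient.ExtractKeyPhrasesBatchAsync(documents);

foreach (ExtractKeyPhrasesResult result in response.Value)
{
    // A single document can fail even when the overall request succeeds
    if (result.HasError)
    {
        Console.WriteLine($"Document {result.Id} failed: {result.Error.Message}");
        continue;
    }

    Console.WriteLine($"Document {result.Id}: {string.Join(", ", result.KeyPhrases)}");
}
```

Checking HasError per document matters here because accessing KeyPhrases on a failed result is not meaningful.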
Another important note is that this REST request includes an Ocp-Apim-Subscription-Key header containing your Cognitive Services key. This is how Microsoft knows that you’re allowed to make the request to that Cognitive Services endpoint. If you fail to include this header or use the wrong key, your call will not succeed.
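If you would rather make the REST request yourself, the sketch below builds the same POST with HttpClient, including the Ocp-Apim-Subscription-Key header. BuildKeyPhraseRequest is a hypothetical helper, the environment variable names are assumptions, and the v3.1 path is simply the one shown above; your resource may expose a different API version.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper: builds the key phrase POST request described above
static HttpRequestMessage BuildKeyPhraseRequest(Uri endpoint, string key, string text)
{
    // Same shape as the request body shown earlier: one document with id "0"
    var payload = new
    {
        documents = new[] { new { id = "0", text, language = "en" } }
    };

    var request = new HttpRequestMessage(
        HttpMethod.Post,
        new Uri(endpoint, "text/analytics/v3.1/keyPhrases"));

    // Without this header (or with the wrong key) the service rejects the call
    request.Headers.Add("Ocp-Apim-Subscription-Key", key);
    request.Content = new StringContent(
        JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json");
    return request;
}

// Assumed environment variable names; substitute your own configuration source
string? configuredKey = Environment.GetEnvironmentVariable("LANGUAGE_KEY");
string? configuredEndpoint = Environment.GetEnvironmentVariable("LANGUAGE_ENDPOINT");

if (string.IsNullOrEmpty(configuredKey) || string.IsNullOrEmpty(configuredEndpoint))
{
    Console.WriteLine("Set LANGUAGE_KEY and LANGUAGE_ENDPOINT to send the request.");
}
else
{
    using HttpClient http = new HttpClient();
    using HttpRequestMessage request = BuildKeyPhraseRequest(
        new Uri(configuredEndpoint), configuredKey, "Key phrase extraction summarizes text.");
    using HttpResponseMessage response = await http.SendAsync(request);

    string responseBody = await response.Content.ReadAsStringAsync();
    Console.WriteLine(response.IsSuccessStatusCode
        ? responseBody
        : $"Request failed ({(int)response.StatusCode}): {responseBody}");
}
```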
Understanding Key Phrase Results
Let’s take a look at the JSON response we got to the REST request we made earlier:
{
  "documents":[
    {
      "id":"0",
      "keyPhrases":[
        "artificial intelligence technology",
        "Key phrase extraction",
        "Azure Cognitive Services",
        "key phrases",
        "language services",
        "Azure SDK",
        "larger portions",
        "search results",
        "larger body",
        "C# code",
        "REST requests",
        "text",
        "readers",
        "work",
        "capability",
        "article"
      ],
      "warnings":[]
    }
  ],
  "errors":[],
  "modelVersion":"2022-07-01"
}
Compared to some of the other cognitive services, this isn’t a complicated response. It centers on an array of key phrase strings and carries very little additional information.
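If you call the REST API directly instead of going through the SDK, you would need to read the key phrases out of this JSON yourself. The following is a minimal sketch using System.Text.Json; ReadKeyPhrases is a hypothetical helper name, and the sample JSON is a trimmed copy of the response above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: collects the keyPhrases of every document in a response body
static List<string> ReadKeyPhrases(string json)
{
    var phrases = new List<string>();
    using JsonDocument doc = JsonDocument.Parse(json);

    foreach (JsonElement document in doc.RootElement.GetProperty("documents").EnumerateArray())
    {
        foreach (JsonElement phrase in document.GetProperty("keyPhrases").EnumerateArray())
        {
            phrases.Add(phrase.GetString() ?? string.Empty);
        }
    }

    return phrases;
}

// Trimmed-down copy of the response shown above, with only two key phrases
const string sampleJson =
    @"{""documents"":[{""id"":""0"",""keyPhrases"":[""Azure SDK"",""C# code""],""warnings"":[]}],""errors"":[],""modelVersion"":""2022-07-01""}";

foreach (string extracted in ReadKeyPhrases(sampleJson))
{
    Console.WriteLine(extracted); // prints "Azure SDK", then "C# code"
}
```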
We don’t need to worry about the request or response bodies if we are working with C# and the Azure SDK. Instead, we can just loop over the key phrases as shown below:
KeyPhraseCollection keyPhrases = keyPhrasesResponse.Value;
Console.WriteLine("Key Phrases:");
foreach (string phrase in keyPhrases)
{
    Console.WriteLine($"\t{phrase}");
}
This would display the following text:
Key Phrases:
artificial intelligence technology
Key phrase extraction
Azure Cognitive Services
key phrases
language services
Azure SDK
larger portions
search results
larger body
C# code
REST requests
text
readers
work
capability
article
While this doesn’t give us a ton of information about the intent of the paragraph, it does tell us the things that were mentioned in the paragraph, which can be valuable for highlighting, search results, or determining if a specific key phrase of interest is present.
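As an example of that last use case, here is a small sketch that checks whether a phrase of interest appears in a set of key phrases, ignoring case. ContainsPhrase is a hypothetical helper, and the sample array holds a few of the phrases returned above; with the SDK you could pass keyPhrasesResponse.Value instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: case-insensitive check for a phrase of interest
static bool ContainsPhrase(IEnumerable<string> keyPhrases, string phraseOfInterest)
{
    return keyPhrases.Any(p =>
        string.Equals(p, phraseOfInterest, StringComparison.OrdinalIgnoreCase));
}

// A few of the key phrases from the results above
string[] samplePhrases = { "Azure Cognitive Services", "Azure SDK", "search results" };

Console.WriteLine(ContainsPhrase(samplePhrases, "azure sdk")); // True
Console.WriteLine(ContainsPhrase(samplePhrases, "sentiment")); // False
```

This is an exact match on whole phrases; a substring or fuzzy match may suit some applications better.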
Key phrase extraction is just one of the many different offerings available in Azure Cognitive Services, but it can help propel the right application to greater capabilities by allowing you to focus on writing your app and leaving the text analysis to Azure.