Commit 805c5f1

Merge pull request #2 from AssemblyAI/support-input
Support INPUT variable
2 parents e921fce + bec677e commit 805c5f1

5 files changed: 197 additions and 87 deletions.

README.md

Lines changed: 49 additions & 23 deletions
@@ -7,7 +7,7 @@
 [![AssemblyAI Twitter](https://img.shields.io/twitter/follow/AssemblyAI?label=%40AssemblyAI&style=social "AssemblyAI Twitter")](https://twitter.com/AssemblyAI)
 [![AssemblyAI YouTube](https://img.shields.io/youtube/channel/subscribers/UCtatfZMf-8EkIwASXM4ts0A "AssemblyAI YouTube")](https://www.youtube.com/@AssemblyAI)

-# AssemblyAI plugins for Semantic Kernel
+# AssemblyAI integration for Semantic Kernel

 Transcribe audio using AssemblyAI with Semantic Kernel plugins.

@@ -35,21 +35,19 @@ string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")

 var transcriptPlugin = kernel.ImportSkill(
     new TranscriptPlugin(apiKey: apiKey),
-    "TranscriptPlugin"
+    TranscriptPlugin.PluginName
 );
 ```

+## Usage
+
 Get the `Transcribe` function from the transcript plugin and invoke it with the context variables.
 ```csharp
-var variables = new ContextVariables
-{
-    ["audioUrl"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a"
-};
-
-var context = await kernel.Skills
-    .GetFunction("TranscriptPlugin", "Transcribe")
-    .InvokeAsync(variables);
-
+var function = kernel.Skills
+    .GetFunction(TranscriptPlugin.PluginName, TranscriptPlugin.TranscribeFunctionName);
+var context = kernel.CreateNewContext();
+context.Variables["audioUrl"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a";
+await function.InvokeAsync(context);
 Console.WriteLine(context.Result);
 ```

@@ -65,22 +63,50 @@ var transcriptPlugin = kernel.ImportSkill(
     {
         AllowFileSystemAccess = true
     },
-    "TranscriptPlugin"
+    TranscriptPlugin.PluginName
 );
+var function = kernel.Skills
+    .GetFunction(TranscriptPlugin.PluginName, TranscriptPlugin.TranscribeFunctionName);
+var context = kernel.CreateNewContext();
+context.Variables["filePath"] = "./espn.m4a";
+await function.InvokeAsync(context);
+Console.WriteLine(context.Result);
+```

-var variables = new ContextVariables
-{
-    ["filePath"] = "./espn.m4a"
-};
+If `filePath` and `audioUrl` are specified, the `filePath` will be used to upload the file and `audioUrl` will be overridden.

-var context = await kernel.Skills
-    .GetFunction("TranscriptPlugin", "Transcribe")
-    .InvokeAsync(variables);
-
+Lastly, you can also use the `INPUT` variable, so you can transcribe a file like this.
+
+```csharp
+var function = kernel.Skills
+    .GetFunction(TranscriptPlugin.PluginName, TranscriptPlugin.TranscribeFunctionName);
+var context = await function.InvokeAsync("./espn.m4a");
+```
+
+Or from within a semantic function like this.
+
+```csharp
+var prompt = """
+Here is a transcript:
+{{TranscriptPlugin.Transcribe "https://storage.googleapis.com/aai-docs-samples/espn.m4a"}}
+---
+Summarize the transcript.
+""";
+var context = kernel.CreateNewContext();
+var function = kernel.CreateSemanticFunction(prompt);
+await function.InvokeAsync(context);
 Console.WriteLine(context.Result);
 ```

-If `filePath` and `audioUrl` are specified, the `filePath` will be used to upload the file and `audioUrl` will be overridden.
+If the `INPUT` variable is a URL, it'll be used as the `audioUrl`, otherwise, it'll be used as the `filePath`.
+If either `audioUrl` or `filePath` are configured, `INPUT` is ignored.
+
+All the code above explicitly invokes the transcript plugin, but it can also be invoked as part of a plan.
+Check out [the Sample project](./src/Sample/Program.cs#L50) which uses a plan to transcribe an audio file in addition to explicit invocation.
+
+## Notes

-The code above explicitly invokes the transcript plugin, but it can also be invoked as part of a plan.
-Check out [the Sample project](./src/Sample/Program.cs#L54) which uses a plan to transcribe an audio file in addition to explicit invocation.
+- The AssemblyAI integration only supports Semantic Kernel with .NET at this moment.
+  If there's demand, we will extend support to other platforms, so let us know!
+- Semantic Kernel itself is still in pre-release, and changes frequently, so we'll keep our integration in pre-release until SK is GA'd.
+- Feel free to [file an issue](https://github.com/AssemblyAI/assemblyai-semantic-kernel/issues) in case of bugs or feature requests.

src/AssemblyAI.SemanticKernel/AssemblyAI.SemanticKernel.csproj

Lines changed: 3 additions & 3 deletions
@@ -11,9 +11,9 @@
     <PackageTags>SemanticKernel;AI;AssemblyAI;transcript</PackageTags>
     <Company>AssemblyAI</Company>
     <Product>AssemblyAI</Product>
-    <AssemblyVersion>1.0.1.0</AssemblyVersion>
-    <FileVersion>1.0.1.0</FileVersion>
-    <PackageVersion>1.0.1-alpha</PackageVersion>
+    <AssemblyVersion>1.0.2.0</AssemblyVersion>
+    <FileVersion>1.0.2.0</FileVersion>
+    <PackageVersion>1.0.2-alpha</PackageVersion>
     <OutputType>Library</OutputType>
     <PackageLicenseExpression>MIT</PackageLicenseExpression>
     <PackageProjectUrl>https://github.com/AssemblyAI/assemblyai-semantic-kernel</PackageProjectUrl>

src/AssemblyAI.SemanticKernel/TranscriptPlugin.cs

Lines changed: 48 additions & 9 deletions
@@ -30,16 +30,15 @@ public TranscriptPlugin(string apiKey)
 If filePath is configured, the file will be uploaded to AssemblyAI, and then used as the audioUrl to transcribe.
 Optional if audioUrl is configured. The uploaded file will override the audioUrl parameter.")]
     [SKParameter("audioUrl", @"The public URL of the audio or video file to transcribe.
-Optional if filePath is configured.
-""")]
+Optional if filePath is configured.")]
     public async Task<string> Transcribe(SKContext context)
     {
+        SetPathAndUrl(context, out var filePath, out var audioUrl);
         using (var httpClient = new HttpClient())
         {
             httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(_apiKey);

-            string audioUrl;
-            if (context.Variables.TryGetValue("filePath", out var filePath))
+            if (filePath != null)
             {
                 if (AllowFileSystemAccess == false)
                 {
@@ -50,16 +49,56 @@ public async Task<string> Transcribe(SKContext context)

                 audioUrl = await UploadFileAsync(filePath, httpClient);
             }
+
+            var transcript = await CreateTranscriptAsync(audioUrl, httpClient);
+            transcript = await WaitForTranscriptToProcess(transcript, httpClient);
+            return transcript.Text ?? throw new Exception("Transcript text is null. This should not happen.");
+        }
+    }
+
+    private static void SetPathAndUrl(SKContext context, out string filePath, out string audioUrl)
+    {
+        filePath = null;
+        audioUrl = null;
+        if (context.Variables.TryGetValue("filePath", out filePath))
+        {
+            return;
+        }
+
+        if (context.Variables.TryGetValue("audioUrl", out audioUrl))
+        {
+            var uri = new Uri(audioUrl);
+            if (uri.IsFile)
+            {
+                filePath = uri.LocalPath;
+                audioUrl = null;
+            }
             else
             {
-                context.Variables.TryGetValue("audioUrl", out audioUrl);
+                return;
             }
+        }

-            if (audioUrl is null) throw new Exception("You have to pass in the filePath or audioUrl parameter.");
+        context.Variables.TryGetValue("INPUT", out var input);
+        if (input == null)
+        {
+            throw new Exception("You must pass in INPUT, filePath, or audioUrl parameter.");
+        }

-            var transcript = await CreateTranscriptAsync(audioUrl, httpClient);
-            transcript = await WaitForTranscriptToProcess(transcript, httpClient);
-            return transcript.Text ?? throw new Exception("Transcript text is null. This should not happen.");
+        if (Uri.TryCreate(input, UriKind.Absolute, out var inputUrl))
+        {
+            if (inputUrl.IsFile)
+            {
+                filePath = inputUrl.LocalPath;
+            }
+            else
+            {
+                audioUrl = input;
+            }
+        }
+        else
+        {
+            filePath = input;
         }
     }
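The precedence that `SetPathAndUrl` introduces (`filePath`, then `audioUrl`, then `INPUT`) can be sketched without Semantic Kernel. This is a rough illustration, not the plugin's code: `ResolveSource` is a hypothetical helper, and a plain dictionary stands in for `SKContext.Variables`. Unlike the diff, which as written appears to continue on to the `INPUT` lookup after converting a `file://` `audioUrl` into a path, the sketch returns immediately; that difference is deliberate.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; a dictionary stands in for SKContext.Variables.
static (string? FilePath, string? AudioUrl) ResolveSource(IReadOnlyDictionary<string, string> variables)
{
    // 1. An explicit filePath wins over everything else.
    if (variables.TryGetValue("filePath", out var filePath))
    {
        return (filePath, null);
    }

    // 2. Otherwise an explicit audioUrl is used; a file:// URI is treated as a local path.
    if (variables.TryGetValue("audioUrl", out var audioUrl))
    {
        var uri = new Uri(audioUrl);
        if (uri.IsFile) return (uri.LocalPath, null);
        return (null, audioUrl);
    }

    // 3. INPUT is only consulted when neither named variable is set.
    if (!variables.TryGetValue("INPUT", out var input))
    {
        throw new ArgumentException("You must pass in INPUT, filePath, or audioUrl.");
    }

    // An absolute, non-file URI is treated as a remote URL; anything else as a path.
    if (Uri.TryCreate(input, UriKind.Absolute, out var inputUri))
    {
        if (inputUri.IsFile) return (inputUri.LocalPath, null);
        return (null, input);
    }

    return (input, null);
}

var fromUrl = ResolveSource(new Dictionary<string, string>
{
    ["INPUT"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a"
});
Console.WriteLine($"audioUrl: {fromUrl.AudioUrl}");

var fromPath = ResolveSource(new Dictionary<string, string> { ["INPUT"] = "./espn.m4a" });
Console.WriteLine($"filePath: {fromPath.FilePath}");
```

One thing worth checking against the real plugin: `new Uri(audioUrl)` throws a `UriFormatException` for a relative value, whereas the `INPUT` branch uses `Uri.TryCreate` and falls back to treating the value as a path.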

src/Sample/FindFilePlugin.cs

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,7 @@ namespace AssemblyAI.SemanticKernel.Sample;

 public class FindFilePlugin
 {
+    public const string PluginName = "FindFilePlugin";
     private readonly IKernel _kernel;

     public FindFilePlugin(IKernel kernel)
@@ -32,6 +33,9 @@ public FindFilePlugin(IKernel kernel)
         return matches.LastOrDefault()?.Value ?? null;
     }

+
+    public const string LocateFileFunctionName = nameof(LocateFile);
+
     [SKFunction, Description("Find files in common folders.")]
     [SKParameter("fileName", "The name of the file")]
     [SKParameter("commonFolderName", "The name of the common folder")]
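The name constants added here replace string literals at the call sites. A minimal sketch of why `nameof` helps, using a hypothetical `DemoPlugin` class that is not part of the repository: the constant follows the identifier, so renaming the method with refactoring tools updates the lookup name as well.

```csharp
using System;

// Top-level statements must come before type declarations in a C# file.
Console.WriteLine($"{DemoPlugin.PluginName}.{DemoPlugin.EchoFunctionName}");

// Hypothetical plugin used only for illustration.
public class DemoPlugin
{
    // nameof yields the identifier as a compile-time string constant.
    public const string PluginName = nameof(DemoPlugin);
    public const string EchoFunctionName = nameof(Echo);

    public string Echo(string input) => input;
}
```

Note that the diff itself uses a string literal for `FindFilePlugin.PluginName` and `nameof` only for the function name; both approaches give callers one place to reference.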

src/Sample/Program.cs

Lines changed: 93 additions & 52 deletions
@@ -1,70 +1,111 @@
 using System.Text.Json;
 using Microsoft.Extensions.Configuration;
+using Microsoft.Extensions.Logging;
 using Microsoft.SemanticKernel;
 using Microsoft.SemanticKernel.Orchestration;
 using Microsoft.SemanticKernel.Planning;
-using AssemblyAI.SemanticKernel;
-using AssemblyAI.SemanticKernel.Sample;
-using Microsoft.Extensions.Logging;

-var config = new ConfigurationBuilder()
-    .AddEnvironmentVariables()
-    .AddUserSecrets<Program>()
-    .AddCommandLine(args)
-    .Build();
-
-using var loggerFactory = LoggerFactory.Create(builder => { builder.SetMinimumLevel(0); });
-var kernel = new KernelBuilder()
-    .WithOpenAIChatCompletionService(
-        "gpt-3.5-turbo",
-        config["OpenAI:ApiKey"] ?? throw new Exception("OpenAI:ApiKey configuration is required.")
-    )
-    .WithLoggerFactory(loggerFactory)
-    .Build();
-
-var apiKey = config["AssemblyAI:ApiKey"] ?? throw new Exception("AssemblyAI:ApiKey configuration is required.");
-
-var transcriptPlugin = kernel.ImportSkill(
-    new TranscriptPlugin(apiKey: apiKey)
+namespace AssemblyAI.SemanticKernel.Sample;
+
+internal class Program
+{
+    public static async Task Main(string[] args)
     {
-        AllowFileSystemAccess = true
-    },
-    TranscriptPlugin.PluginName
-);
+        var config = BuildConfig(args);

-await TranscribeFileUsingPlugin(kernel);
+        var kernel = BuildKernel(config);

-async Task TranscribeFileUsingPlugin(IKernel kernel)
-{
-    var variables = new ContextVariables
+        await TranscribeFileUsingPluginDirectly(kernel);
+        //await TranscribeFileUsingPluginFromSemanticFunction(kernel);
+        //await TranscribeFileUsingPlan(kernel);
+    }
+
+    private static IKernel BuildKernel(IConfiguration config)
     {
-        ["audioUrl"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a",
-    };
+        var loggerFactory = LoggerFactory.Create(builder => { builder.SetMinimumLevel(0); });
+        var kernel = new KernelBuilder()
+            .WithOpenAIChatCompletionService(
+                "gpt-3.5-turbo",
+                config["OpenAI:ApiKey"] ?? throw new Exception("OpenAI:ApiKey configuration is required.")
+            )
+            .WithLoggerFactory(loggerFactory)
+            .Build();

-    var result = await kernel.Skills
-        .GetFunction(TranscriptPlugin.PluginName, TranscriptPlugin.TranscribeFunctionName)
-        .InvokeAsync(variables);
-    Console.WriteLine(result.Result);
-}
+        var apiKey = config["AssemblyAI:ApiKey"] ?? throw new Exception("AssemblyAI:ApiKey configuration is required.");

-var findFilePlugin = kernel.ImportSkill(
-    new FindFilePlugin(kernel: kernel),
-    "FindFilePlugin"
-);
+        kernel.ImportSkill(
+            new TranscriptPlugin(apiKey: apiKey)
+            {
+                AllowFileSystemAccess = true
+            },
+            TranscriptPlugin.PluginName
+        );

-await TranscribeFileUsingPlan(kernel);
+        kernel.ImportSkill(
+            new FindFilePlugin(kernel: kernel),
+            FindFilePlugin.PluginName
+        );
+        return kernel;
+    }

-async Task TranscribeFileUsingPlan(IKernel kernel)
-{
-    var planner = new SequentialPlanner(kernel);
+    private static IConfigurationRoot BuildConfig(string[] args)
+    {
+        var config = new ConfigurationBuilder()
+            .AddEnvironmentVariables()
+            .AddUserSecrets<Program>()
+            .AddCommandLine(args)
+            .Build();
+        return config;
+    }
+
+    private static async Task TranscribeFileUsingPluginDirectly(IKernel kernel)
+    {
+        Console.WriteLine("Transcribing file using plugin directly");
+        var variables = new ContextVariables
+        {
+            ["audioUrl"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a",
+            // ["filePath"] = "./espn.m4a" // you can also use `filePath` which will upload the file and override `audioUrl`
+        };
+
+        var result = await kernel.Skills
+            .GetFunction(TranscriptPlugin.PluginName, TranscriptPlugin.TranscribeFunctionName)
+            .InvokeAsync(variables);
+
+        Console.WriteLine(result.Result);
+        Console.WriteLine();
+    }
+
+    private static async Task TranscribeFileUsingPluginFromSemanticFunction(IKernel kernel)
+    {
+        Console.WriteLine("Transcribing file and summarizing from within a semantic function");
+        // This will pass the URL to the `INPUT` variable.
+        // If `INPUT` is a URL, it'll use `INPUT` as `audioUrl`, otherwise, it'll use `INPUT` as `filePath`.
+        const string prompt = """
+            Here is a transcript:
+            {{TranscriptPlugin.Transcribe "https://storage.googleapis.com/aai-docs-samples/espn.m4a"}}
+            ---
+            Summarize the transcript.
+            """;
+        var context = kernel.CreateNewContext();
+        var function = kernel.CreateSemanticFunction(prompt);
+        await function.InvokeAsync(context);
+        Console.WriteLine(context.Result);
+        Console.WriteLine();
+    }
+
+    private static async Task TranscribeFileUsingPlan(IKernel kernel)
+    {
+        Console.WriteLine("Transcribing file from a plan");
+        var planner = new SequentialPlanner(kernel);

-    const string prompt = "Transcribe the espn.m4a in my downloads folder.";
-    var plan = await planner.CreatePlanAsync(prompt);
+        const string prompt = "Transcribe the espn.m4a in my downloads folder.";
+        var plan = await planner.CreatePlanAsync(prompt);

-    Console.WriteLine("Plan:\n");
-    Console.WriteLine(JsonSerializer.Serialize(plan, new JsonSerializerOptions { WriteIndented = true }));
+        Console.WriteLine("Plan:\n");
+        Console.WriteLine(JsonSerializer.Serialize(plan, new JsonSerializerOptions { WriteIndented = true }));

-    var transcript = (await kernel.RunAsync(plan)).Result;
-    Console.WriteLine("Transcript:");
-    Console.WriteLine(transcript);
+        var transcript = (await kernel.RunAsync(plan)).Result;
+        Console.WriteLine(transcript);
+        Console.WriteLine();
+    }
 }
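`BuildConfig` layers three configuration sources, and with `Microsoft.Extensions.Configuration` a source added later overrides an earlier one for the same key. The sketch below illustrates that ordering under a few assumptions: an in-memory collection stands in for environment variables and user secrets, `cliArgs` stands in for the `args` array passed to `Main`, and the values are placeholders. It needs the `Microsoft.Extensions.Configuration` and `Microsoft.Extensions.Configuration.CommandLine` packages.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Stand-in for values that would normally come from environment variables or user secrets.
var defaults = new Dictionary<string, string?>
{
    ["OpenAI:ApiKey"] = "from-defaults",
    ["AssemblyAI:ApiKey"] = "from-defaults",
};

// Stand-in for the args array that Main receives.
var cliArgs = new[] { "--AssemblyAI:ApiKey=from-command-line" };

var config = new ConfigurationBuilder()
    .AddInMemoryCollection(defaults)
    .AddCommandLine(cliArgs) // added last, so it overrides earlier sources
    .Build();

Console.WriteLine(config["OpenAI:ApiKey"]);     // only present in the defaults
Console.WriteLine(config["AssemblyAI:ApiKey"]); // overridden by the command-line value
```

When the keys come from environment variables instead, the environment variables provider maps a double underscore to the `:` separator, so `AssemblyAI__ApiKey` supplies `AssemblyAI:ApiKey`.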
