- Documentation >
- AI Actions >
- Extend AI Actions
Extend AI Actions
By extending AI Actions, you can make regular content management and editing tasks more appealing and less demanding.
You can start by integrating additional AI services to the existing action types or develop custom ones that impact completely new areas of application.
For example, you can create a handler that connects to a translation model and use it to translate your website on-the-fly, or generate illustrations based on a body of an article.
Execute Actions
You can execute AI Actions by using the ActionServiceInterface service, as in the following example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19 | $action = new GenerateAltTextAction(new Image([$imageEncodedInBase64]));
$action->setRuntimeContext(new RuntimeContext(['languageCode' => $languageCode]));
$action->setActionContext(
new ActionContext(
new ActionConfigurationOptions(['default_locale_fallback' => 'en']), // System context
new ActionConfigurationOptions(['max_lenght' => 100]), // Action Type options
new ActionConfigurationOptions( // Action Handler options
[
'prompt' => 'Generate the alt text for this image in less than 100 characters.',
'temperature' => 0.7,
'max_tokens' => 4096,
'model' => 'gpt-4o-mini',
]
)
)
);
$output = $this->actionService->execute($action)->getOutput();
|
The GenerateAltTextAction
is a built-in action that implements the ActionInterface, takes an Image as an input, and generates the alternative text in the response.
This action is parameterized with the RuntimeContext and the ActionContext, which allows you to pass additional options to the Action before it's executed.
Type of context |
Type of options |
Usage |
Example |
Runtime Context |
Runtime options |
Sets additional parameters that are relevant to the specific action that is currently executed |
Information about the language of the content that is being processed |
Action Context |
Action Type options |
Sets additional parameters for the Action Type |
Information about the expected response length |
Action Context |
Action Handler options |
Sets additional parameters for the Action Handler |
Information about the model, temperature, prompt, and max tokens allowed |
Action Context |
System options |
Sets additional information, not matching the other option collections |
Information about the fallback locale |
Both ActionContext
and RuntimeContext
are passed to the Action Handler (an object implementing the ActionHandlerInterface) to execute the action. The Action Handler is responsible for combining all the options together, sending them to the AI service and returning an ActionResponse.
You can pass the Action Handler directly to the ActionServiceInterface::execute()
method, which overrides all the other ways of selecting the Action Handler.
You can also specify the Action Handler by including it in the provided Action Configuration.
In other cases, the Action Handler is selected automatically.
You can affect this choice by creating your own class implementing the ActionHandlerResolverInterface or by listening to the ResolveActionHandlerEvent Event sent by the default implementation.
You can influence the execution of an Action with two events:
Below you can find the full example of a Symfony Command, together with a matching service definition.
The command finds the images modified in the last 24 hours, and adds the alternative text to them if it's missing.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158 | <?php
declare(strict_types=1);
namespace App\Command;
use Ibexa\Contracts\ConnectorAi\Action\ActionContext;
use Ibexa\Contracts\ConnectorAi\Action\DataType\Image;
use Ibexa\Contracts\ConnectorAi\Action\DataType\Text;
use Ibexa\Contracts\ConnectorAi\Action\GenerateAltTextAction;
use Ibexa\Contracts\ConnectorAi\Action\RuntimeContext;
use Ibexa\Contracts\ConnectorAi\ActionConfiguration\ActionConfigurationOptions;
use Ibexa\Contracts\ConnectorAi\ActionServiceInterface;
use Ibexa\Contracts\Core\Repository\ContentService;
use Ibexa\Contracts\Core\Repository\FieldTypeService;
use Ibexa\Contracts\Core\Repository\PermissionResolver;
use Ibexa\Contracts\Core\Repository\UserService;
use Ibexa\Contracts\Core\Repository\Values\Content\ContentList;
use Ibexa\Contracts\Core\Repository\Values\Content\Query\Criterion\ContentTypeIdentifier;
use Ibexa\Contracts\Core\Repository\Values\Content\Query\Criterion\DateMetadata;
use Ibexa\Contracts\Core\Repository\Values\Content\Query\Criterion\Operator;
use Ibexa\Contracts\Core\Repository\Values\Filter\Filter;
use Ibexa\Core\FieldType\Image\Value;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
final class AddMissingAltTextCommand extends Command
{
protected static $defaultName = 'app:add-alt-text';
private const IMAGE_FIELD_IDENTIFIER = 'image';
private ContentService $contentService;
private PermissionResolver $permissionResolver;
private UserService $userService;
private FieldTypeService $fieldTypeService;
private ActionServiceInterface $actionService;
private string $projectDir;
public function __construct(
ContentService $contentService,
PermissionResolver $permissionResolver,
UserService $userService,
FieldTypeService $fieldTypeService,
ActionServiceInterface $actionService,
string $projectDir
) {
parent::__construct();
$this->contentService = $contentService;
$this->permissionResolver = $permissionResolver;
$this->userService = $userService;
$this->fieldTypeService = $fieldTypeService;
$this->actionService = $actionService;
$this->projectDir = $projectDir;
}
protected function configure(): void
{
$this->addArgument('user', InputArgument::OPTIONAL, 'Login of the user executing the actions', 'admin');
}
protected function execute(InputInterface $input, OutputInterface $output): int
{
$this->setUser($input->getArgument('user'));
$modifiedImages = $this->getModifiedImages();
$output->writeln(sprintf('Found %d modified image in the last 24h', $modifiedImages->getTotalCount()));
/** @var \Ibexa\Contracts\Core\Repository\Values\Content\Content $content */
foreach ($modifiedImages as $content) {
/** @var ?Value $value */
$value = $content->getFieldValue(self::IMAGE_FIELD_IDENTIFIER);
if ($value === null || !$this->shouldGenerateAltText($value)) {
$output->writeln(sprintf('Image %s has the image field empty or the alternative text is already specified. Skipping.', $content->getName()));
continue;
}
$contentUpdateStruct = $this->contentService->newContentUpdateStruct();
$value->alternativeText = $this->getSuggestedAltText($this->convertImageToBase64($value->uri), $content->getDefaultLanguageCode());
$contentUpdateStruct->setField(self::IMAGE_FIELD_IDENTIFIER, $value);
$updatedContent = $this->contentService->updateContent(
$this->contentService->createContentDraft($content->getContentInfo())->getVersionInfo(),
$contentUpdateStruct
);
$this->contentService->publishVersion($updatedContent->getVersionInfo());
}
return Command::SUCCESS;
}
private function getSuggestedAltText(string $imageEncodedInBase64, string $languageCode): string
{
$action = new GenerateAltTextAction(new Image([$imageEncodedInBase64]));
$action->setRuntimeContext(new RuntimeContext(['languageCode' => $languageCode]));
$action->setActionContext(
new ActionContext(
new ActionConfigurationOptions(['default_locale_fallback' => 'en']), // System context
new ActionConfigurationOptions(['max_lenght' => 100]), // Action Type options
new ActionConfigurationOptions( // Action Handler options
[
'prompt' => 'Generate the alt text for this image in less than 100 characters.',
'temperature' => 0.7,
'max_tokens' => 4096,
'model' => 'gpt-4o-mini',
]
)
)
);
$output = $this->actionService->execute($action)->getOutput();
assert($output instanceof Text);
return $output->getText();
}
private function convertImageToBase64(?string $uri): string
{
$file = file_get_contents($this->projectDir . \DIRECTORY_SEPARATOR . 'public' . \DIRECTORY_SEPARATOR . $uri);
if ($file === false) {
throw new \RuntimeException('Cannot read file');
}
return 'data:image/jpeg;base64,' . base64_encode($file);
}
private function getModifiedImages(): ContentList
{
$filter = (new Filter())
->withCriterion(
new DateMetadata(DateMetadata::MODIFIED, Operator::GTE, strtotime('-1 day'))
)
->andWithCriterion(new ContentTypeIdentifier('image'));
return $this->contentService->find($filter);
}
private function shouldGenerateAltText(Value $value): bool
{
return $this->fieldTypeService->getFieldType('ezimage')->isEmptyValue($value) === false &&
$value->isAlternativeTextEmpty();
}
private function setUser(string $userLogin): void
{
$this->permissionResolver->setCurrentUserReference($this->userService->loadUserByLogin($userLogin));
}
}
|
| App\Command\AddMissingAltTextCommand:
arguments:
$projectDir: '%kernel.project_dir%'
|
Executing Actions this way has a major drawback: all the parameters are stored directly in the code and cannot be easily reused or changed.
To manage configurations of an AI Action you need to use another concept: Action Configurations.
Action Configurations
Manage Action Configurations
Action Configurations allow you to store the parameters for a given Action in the database and reuse them when needed.
They can be managed through the back office, data migrations, or through the PHP API.
To manage Action Configurations through the PHP API, you need to use the ActionConfigurationServiceInterface service.
You can manage them using the following methods:
See the AI Actions event reference for a list of events related to these operations.
You can get a specific Action Configuration using the ActionConfigurationServiceInterface::getActionConfiguration()
method and search for them using the ActionConfigurationServiceInterface::findActionConfigurations()
method.
See Action Configuration Search Criteria reference and Action Configuration Search Sort Clauses reference to discover query possibilities.
The following example creates a new Action Configuration:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17 | $refineTextActionType = $this->actionTypeRegistry->getActionType('refine_text');
$actionConfigurationCreateStruct = new ActionConfigurationCreateStruct('rewrite_casual');
$actionConfigurationCreateStruct->setType($refineTextActionType);
$actionConfigurationCreateStruct->setName('eng-GB', 'Rewrite in casual tone');
$actionConfigurationCreateStruct->setDescription('eng-GB', 'Rewrites the text using a casual tone');
$actionConfigurationCreateStruct->setActionHandler('openai-text-to-text');
$actionConfigurationCreateStruct->setActionHandlerOptions(new ArrayMap([
'max_tokens' => 4000,
'temperature' => 1,
'prompt' => 'Rewrite this content to improve readability. Preserve meaning and crucial information but use casual language accessible to a broader audience.',
'model' => 'gpt-4-turbo',
]));
$actionConfigurationCreateStruct->setEnabled(true);
$this->actionConfigurationService->createActionConfiguration($actionConfigurationCreateStruct);
|
Actions Configurations are tied to a specific Action Type and are translatable.
Execute Actions with Action Configurations
Reuse existing Action Configurations to simplify the execution of AI Actions.
You can pass one directly to the ActionServiceInterface::execute()
method:
| $action = new RefineTextAction(new Text([
<<<TEXT
Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes,
and which usually results in protein folding into a specific 3D structure that determines its activity.
TEXT
]));
$actionConfiguration = $this->actionConfigurationService->getActionConfiguration('rewrite_casual');
$actionResponse = $this->actionService->execute($action, $actionConfiguration)->getOutput();
|
The passed Action Configuration is only taken into account if the Action Context was not passed to the Action directly using the ActionInterface::setActionContext() method.
The ActionServiceInterface
service extracts the configuration options from the Action Configuration object and builds the Action Context object internally:
- Action Type options are mapped to Action Type options in the Action Context
- Action Handler options are mapped to Action Handler options in the Action Context
- System Context options are modified using the ContextEvent event
Create custom Action Handler
Ibexa DXP comes with a built-in connector to OpenAI services, but you're not limited to it and can add support for additional AI services in your application.
The following example adds a new Action Handler connecting to a local AI run using the llamafile project which you can use to execute Text-To-Text Actions, such as the built-in "Refine Text" Action.
Register a custom Action Handler in the system.
Create a class implementing the ActionHandlerInterface and register it as a service:
- The
ActionHandlerInterface::supports()
method decides whether the Action Handler is able to execute given Action.
- The
ActionHandlerInterface::handle()
method is responsible for combining all the Action options together, sending them to the AI service and forming an Action Response.
- The
ActionHandlerInterface::getIdentifier()
method returns the identifier of the Action Handler which you can use to refer to it in other places in the code.
See the code sample below, together with a matching service definition:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80 | <?php
declare(strict_types=1);
namespace App\AI\Handler;
use Ibexa\Contracts\ConnectorAi\Action\ActionHandlerInterface;
use Ibexa\Contracts\ConnectorAi\Action\DataType\Text;
use Ibexa\Contracts\ConnectorAi\Action\Response\TextResponse;
use Ibexa\Contracts\ConnectorAi\Action\TextToText\Action as TextToTextAction;
use Ibexa\Contracts\ConnectorAi\ActionInterface;
use Ibexa\Contracts\ConnectorAi\ActionResponseInterface;
use Symfony\Contracts\HttpClient\HttpClientInterface;
final class LLaVaTextToTextActionHandler implements ActionHandlerInterface
{
private HttpClientInterface $client;
private string $host;
public const IDENTIFIER = 'LLaVATextToText';
public function __construct(HttpClientInterface $client, string $host = 'http://localhost:8080')
{
$this->client = $client;
$this->host = $host;
}
public function supports(ActionInterface $action): bool
{
return $action instanceof TextToTextAction;
}
public function handle(ActionInterface $action, array $context = []): ActionResponseInterface
{
/** @var \Ibexa\Contracts\ConnectorAi\Action\DataType\Text */
$input = $action->getInput();
$text = $this->sanitizeInput($input->getText());
$systemMessage = $action->hasActionContext() ? $action->getActionContext()->getActionHandlerOptions()->get('system_prompt', '') : '';
$response = $this->client->request(
'POST',
sprintf('%s/v1/chat/completions', $this->host),
[
'headers' => [
'Authorization: Bearer no-key',
],
'json' => [
'model' => 'LLaMA_CPP',
'messages' => [
(object)[
'role' => 'system',
'content' => $systemMessage,
],
(object)[
'role' => 'user',
'content' => $text,
],
],
'temperature' => 0.7,
],
]
);
$output = strip_tags(json_decode($response->getContent(), true)['choices'][0]['message']['content']);
return new TextResponse(new Text([$output]));
}
public static function getIdentifier(): string
{
return self::IDENTIFIER;
}
private function sanitizeInput(string $text): string
{
return str_replace(["\n", "\r"], ' ', $text);
}
}
|
| App\AI\Handler\LLaVATextToTextActionHandler:
tags:
- { name: ibexa.ai.action.handler, priority: 0 }
- { name: ibexa.ai.action.handler.text_to_text, priority: 0 }
|
The ibexa.ai.action.handler
tag is used by the ActionHandlerResolverInterface
to find all the Action Handlers in the system.
The built-in Action Types use service tags to find Action Handlers capable of handling them and display in the back office UI:
- Refine Text uses the
ibexa.ai.action.handler.text_to_text
service tag
- Generate Alt Text uses the
ibexa.ai.action.handler.image_to_text
service tag
Form configuration makes the Handler configurable by using the back office.
The example handler uses the system_prompt
option, which becomes part of the Action Configuration UI thanks to the following code:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32 | <?php
declare(strict_types=1);
namespace App\Form\Type;
use Symfony\Component\Form\AbstractType;
use Symfony\Component\Form\Extension\Core\Type\TextareaType;
use Symfony\Component\Form\FormBuilderInterface;
use Symfony\Component\OptionsResolver\OptionsResolver;
final class TextToTextOptionsType extends AbstractType
{
public function buildForm(FormBuilderInterface $builder, array $options): void
{
$builder->add('system_prompt', TextareaType::class, [
'required' => true,
'disabled' => $options['translation_mode'],
'label' => 'System message',
]);
}
public function configureOptions(OptionsResolver $resolver): void
{
$resolver->setDefaults([
'translation_domain' => 'app_ai',
'translation_mode' => false,
]);
$resolver->setAllowedTypes('translation_mode', 'bool');
}
}
|
| app.connector_ai.action_configuration.handler.llava_text_to_text.form_mapper.options:
class: Ibexa\Bundle\ConnectorAi\Form\FormMapper\ActionConfiguration\ActionHandlerOptionsFormMapper
arguments:
$formType: 'App\Form\Type\TextToTextOptionsType'
tags:
- name: ibexa.connector_ai.action_configuration.form_mapper.options
type: !php/const \App\AI\Handler\LLaVaTextToTextActionHandler::IDENTIFIER
|
The created Form Type adds the system_prompt
field to the Form.
Use the Ibexa\Bundle\ConnectorAi\Form\FormMapper\ActionConfiguration\ActionHandlerOptionsFormMapper
class together with the ibexa.connector_ai.action_configuration.form_mapper.options
service tag to make it part of the Action Handler options form.
Pass the Action Handler identifier (LLaVATextToText
) as the type when tagging the service.
The Action Handler and Action Type options are rendered in the back office using the built-in Twig option formatter.
You can create your own formatting by creating a class implementing the OptionsFormatterInterface interface and aliasing it to Ibexa\Contracts\ConnectorAi\ActionConfiguration\OptionsFormatterInterface
.
The following service definition switches the options rendering to the other built-in option formatter, displaying the options as JSON.
| Ibexa\Contracts\ConnectorAi\ActionConfiguration\OptionsFormatterInterface:
alias: Ibexa\ConnectorAi\ActionConfiguration\JsonOptionsFormatter
|
Custom Action Type use case
With custom Action Types you can create your own tasks for the AI services to perform.
They can be integrated with the rest of the AI framework provided by Ibexa and incorporated into the back office.
The following example shows how to implement a custom Action Type dedicated for transcribing audio with an example Handler using the OpenAI's Whisper project.
Create custom Action Type
Start by creating your own Action Type, a class implementing the ActionTypeInterface.
The class needs to define following parameters of the Action Type:
- name
- identifier
- input type identifier
- output type identifier
- Action object
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69 | <?php
declare(strict_types=1);
namespace App\AI\ActionType;
use App\AI\Action\TranscribeAudioAction;
use App\AI\DataType\Audio;
use Ibexa\Contracts\ConnectorAi\Action\DataType\Text;
use Ibexa\Contracts\ConnectorAi\ActionInterface;
use Ibexa\Contracts\ConnectorAi\ActionType\ActionTypeInterface;
use Ibexa\Contracts\ConnectorAi\DataType;
use Ibexa\Contracts\Core\Exception\InvalidArgumentException;
final class TranscribeAudioActionType implements ActionTypeInterface
{
public const IDENTIFIER = 'transcribe_audio';
/** @var iterable<\Ibexa\Contracts\ConnectorAi\Action\ActionHandlerInterface> */
private iterable $actionHandlers;
/** @param iterable<\Ibexa\Contracts\ConnectorAi\Action\ActionHandlerInterface> $actionHandlers*/
public function __construct(iterable $actionHandlers)
{
$this->actionHandlers = $actionHandlers;
}
public function getIdentifier(): string
{
return self::IDENTIFIER;
}
public function getName(): string
{
return 'Transcribe audio';
}
public function getInputIdentifier(): string
{
return Audio::getIdentifier();
}
public function getOutputIdentifier(): string
{
return Text::getIdentifier();
}
public function getOptions(): array
{
return [];
}
public function createAction(DataType $input, array $parameters = []): ActionInterface
{
if (!$input instanceof Audio) {
throw new InvalidArgumentException(
'audio',
'expected \App\AI\DataType\Audio type, ' . get_debug_type($input) . ' given.'
);
}
return new TranscribeAudioAction($input);
}
public function getActionHandlers(): iterable
{
return $this->actionHandlers;
}
}
|
| App\AI\ActionType\TranscribeAudioActionType:
arguments:
$actionHandlers: !tagged_iterator
tag: app.connector_ai.action.handler.audio_to_text
default_index_method: getIdentifier
index_by: key
tags:
- { name: ibexa.ai.action.type, identifier: !php/const \App\AI\ActionType\TranscribeAudioActionType::IDENTIFIER }
|
The service definition introduces a custom app.connector_ai.action.handler.audio_to_text
service tag to mark all the handlers capable of working with this Action Type.
The ibexa.ai.action.type
service tag registers the class in the service container as a new Action Type.
If the Action Type is meant to be used mainly with prompt-based systems you can use the LLMBaseActionTypeInterface interface as the base for your Action Type.
It allows you to define a base prompt directly in the Action Type that can be common for all Action Configurations.
Action Type names can be localized using the Translation component.
See the built-in Action Types like Generate Alt Text or Refine Text for an example.
Create custom Data classes
The TranscribeAudio
Action Type requires adding two data classes that exists in its definition:
- an
Audio
class, implementing the DataType interface, to store the input data for the Action
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39 | <?php
declare(strict_types=1);
namespace App\AI\DataType;
use Ibexa\Contracts\ConnectorAi\DataType;
/**
* @implements DataType<string>
*/
final class Audio implements DataType
{
/** @var non-empty-array<string> */
private array $base64;
/**
* @param non-empty-array<string> $base64
*/
public function __construct(array $base64)
{
$this->base64 = $base64;
}
public function getBase64(): string
{
return reset($this->base64);
}
public function getList(): array
{
return $this->base64;
}
public static function getIdentifier(): string
{
return 'audio';
}
}
|
- an
TranscribeAudioAction
class, implementing the ActionInterface interface. Pass this object to the ActionServiceInterface::execute()
method to execute the action.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33 | <?php
declare(strict_types=1);
namespace App\AI\Action;
use App\AI\DataType\Audio;
use Ibexa\Contracts\ConnectorAi\Action\Action;
final class TranscribeAudioAction extends Action
{
private Audio $audio;
public function __construct(Audio $audio)
{
$this->audio = $audio;
}
public function getParameters(): array
{
return [];
}
public function getInput(): Audio
{
return $this->audio;
}
public function getActionTypeIdentifier(): string
{
return 'transcribe_audio';
}
}
|
Custom Form Type is needed if the Action Type requires additional options configurable in the UI.
The following example adds a checkbox field that indicates to the Action Handler whether the transcription should include the timestamps.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32 | <?php
declare(strict_types=1);
namespace App\Form\Type;
use Symfony\Component\Form\AbstractType;
use Symfony\Component\Form\Extension\Core\Type\CheckboxType;
use Symfony\Component\Form\FormBuilderInterface;
use Symfony\Component\OptionsResolver\OptionsResolver;
final class TranscribeAudioOptionsType extends AbstractType
{
public function buildForm(FormBuilderInterface $builder, array $options): void
{
$builder->add('include_timestamps', CheckboxType::class, [
'required' => false,
'disabled' => $options['translation_mode'],
'label' => 'Include timestamps',
]);
}
public function configureOptions(OptionsResolver $resolver): void
{
$resolver->setDefaults([
'translation_domain' => 'app_ai',
'translation_mode' => false,
]);
$resolver->setAllowedTypes('translation_mode', 'bool');
}
}
|
| app.connector_ai.action_configuration.handler.transcribe_audio.form_mapper.options:
class: Ibexa\Bundle\ConnectorAi\Form\FormMapper\ActionConfiguration\ActionTypeOptionsFormMapper
arguments:
$formType: 'App\Form\Type\TranscribeAudioOptionsType'
tags:
- name: ibexa.connector_ai.action_configuration.form_mapper.action_type_options
type: !php/const \App\AI\ActionType\TranscribeAudioActionType::IDENTIFIER
|
The built-in Ibexa\Bundle\ConnectorAi\Form\FormMapper\ActionConfiguration\ActionTypeOptionsFormMapper
renders the Form Type in the back office when editing the Action Configuration for a specific Action Type (indicated by the type
attribute of the ibexa.connector_ai.action_configuration.form_mapper.action_type_options
service tag).
Create custom Action Handler
An example Action Handler combines the input data and the Action Type options and passes them to the Whisper executable to form an Action Response.
The language of the transcribed data is extracted from the Runtime Context for better results.
The Action Type options provided in the Action Context dictate whether the timestamps will be removed before returning the result.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90 | <?php
declare(strict_types=1);
namespace App\AI\Handler;
use App\AI\ActionType\TranscribeAudioActionType;
use Ibexa\Contracts\ConnectorAi\Action\ActionHandlerInterface;
use Ibexa\Contracts\ConnectorAi\Action\DataType\Text;
use Ibexa\Contracts\ConnectorAi\Action\Response\TextResponse;
use Ibexa\Contracts\ConnectorAi\ActionInterface;
use Ibexa\Contracts\ConnectorAi\ActionResponseInterface;
use Symfony\Component\Process\Exception\ProcessFailedException;
use Symfony\Component\Process\Process;
final class WhisperAudioToTextActionHandler implements ActionHandlerInterface
{
private const TIMESTAMP_FORMAT = '/^\[\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}\.\d{3}]\s*/';
public function supports(ActionInterface $action): bool
{
return $action->getActionTypeIdentifier() === TranscribeAudioActionType::IDENTIFIER;
}
public function handle(ActionInterface $action, array $context = []): ActionResponseInterface
{
/** @var \App\AI\DataType\Audio $input */
$input = $action->getInput();
$path = $this->saveInputToFile($input->getBase64());
$arguments = ['whisper'];
$language = $action->getRuntimeContext()?->get('languageCode');
if ($language !== null) {
$arguments[] = sprintf('--language=%s', substr($language, 0, 2));
}
$arguments[] = '--output_format=txt';
$arguments[] = $path;
$process = new Process($arguments);
$process->run();
if (!$process->isSuccessful()) {
unlink($path);
throw new ProcessFailedException($process);
}
$output = $process->getOutput();
$includeTimestamps = $action->getActionContext()
?->getActionTypeOptions()
->get('include_timestamps', false)
?? false;
if (!$includeTimestamps) {
$output = $this->removeTimestamps($output);
}
unlink($path);
return new TextResponse(new Text([$output]));
}
public static function getIdentifier(): string
{
return 'whisper_audio_to_text';
}
private function removeTimestamps(string $text): string
{
$lines = explode(PHP_EOL, $text);
$processedLines = array_map(static function (string $line): string {
return preg_replace(self::TIMESTAMP_FORMAT, '', $line) ?? '';
}, $lines);
return implode(PHP_EOL, $processedLines);
}
private function saveInputToFile(string $audioEncodedInBase64): string
{
$filename = uniqid('audio');
$path = sys_get_temp_dir() . \DIRECTORY_SEPARATOR . $filename;
file_put_contents($path, base64_decode($audioEncodedInBase64));
return $path;
}
}
|
| App\AI\Handler\WhisperAudioToTextActionHandler:
tags:
- { name: ibexa.ai.action.handler, priority: 0 }
- { name: app.connector_ai.action.handler.audio_to_text, priority: 0 }
|
Integrate with the REST API
At this point the custom Action Type can already be executed by using the PHP API.
To integrate it with the AI Actions execute endpoint you need to create additional classes responsible for parsing the request and response data.
See adding custom media type and creating new REST resource to learn more about extending the REST API.
Start by creating an Input Parser able to handle the application/vnd.ibexa.api.ai.TranscribeAudio
media type.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52 | <?php
declare(strict_types=1);
namespace App\AI\REST\Input\Parser;
use App\AI\DataType\Audio as AudioDataType;
use App\AI\REST\Value\TranscribeAudioAction;
use Ibexa\ConnectorAi\REST\Input\Parser\Action;
use Ibexa\Contracts\ConnectorAi\Action\RuntimeContext;
use Ibexa\Contracts\Rest\Input\ParsingDispatcher;
use Ibexa\Rest\Input\BaseParser;
final class TranscribeAudio extends BaseParser
{
public const AUDIO_KEY = 'Audio';
public const BASE64_KEY = 'base64';
/** @param array<mixed> $data */
public function parse(array $data, ParsingDispatcher $parsingDispatcher): TranscribeAudioAction
{
$this->assertInputIsValid($data);
$runtimeContext = $this->getRuntimeContext($data);
return new TranscribeAudioAction(
new AudioDataType([$data[self::AUDIO_KEY][self::BASE64_KEY]]),
$runtimeContext
);
}
/** @param array<mixed> $data */
private function assertInputIsValid(array $data): void
{
if (!array_key_exists(self::AUDIO_KEY, $data)) {
throw new \InvalidArgumentException('Missing audio key');
}
if (!array_key_exists(self::BASE64_KEY, $data[self::AUDIO_KEY])) {
throw new \InvalidArgumentException('Missing base64 key');
}
}
/**
* @param array<string, mixed> $data
*/
private function getRuntimeContext(array $data): RuntimeContext
{
return new RuntimeContext(
$data[Action::RUNTIME_CONTEXT_KEY] ?? []
);
}
}
|
| App\AI\REST\Input\Parser\TranscribeAudio:
parent: Ibexa\Rest\Server\Common\Parser
tags:
- { name: ibexa.rest.input.parser, mediaType: application/vnd.ibexa.api.ai.TranscribeAudio }
|
The TranscribeAudioAction
is a value object holding the parsed request data.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33 | <?php
declare(strict_types=1);
namespace App\AI\REST\Value;
use App\AI\DataType\Audio;
use Ibexa\Contracts\ConnectorAi\Action\RuntimeContext;
final class TranscribeAudioAction
{
private Audio $input;
private RuntimeContext $runtimeContext;
public function __construct(
Audio $input,
RuntimeContext $runtimeContext
) {
$this->input = $input;
$this->runtimeContext = $runtimeContext;
}
public function getInput(): Audio
{
return $this->input;
}
public function getRuntimeContext(): RuntimeContext
{
return $this->runtimeContext;
}
}
|
Handle output data
To transform the TranscribeAudioAction
into a REST response you need to create:
- An
AudioText
value object holding the REST response data
| <?php
declare(strict_types=1);
namespace App\AI\REST\Value;
use Ibexa\ConnectorAi\REST\Value\RestActionResponse;
final class AudioText extends RestActionResponse
{
}
|
- A resolver converting the Action Response returned from the PHP API layer into the
AudioText
object.
The resolver is activated when application/vnd.ibexa.api.ai.AudioText
media type is specified in the Accept
header:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20 | <?php
declare(strict_types=1);
namespace App\AI\REST\Output\Resolver;
use App\AI\REST\Value\AudioText;
use Ibexa\ConnectorAi\REST\Output\ResolverInterface;
use Ibexa\Contracts\ConnectorAi\ActionResponseInterface;
final class AudioTextResolver implements ResolverInterface
{
public function getRestValue(
ActionResponseInterface $actionResponse
): AudioText {
return new AudioText(
$actionResponse->getOutput()
);
}
}
|
| App\AI\REST\Output\Resolver\AudioTextResolver:
tags:
- { name: ibexa.ai.action.mime_type, key: application/vnd.ibexa.api.ai.AudioText }
|
- A visitor converting the response value object into a serialized REST response:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27 | <?php
declare(strict_types=1);
namespace App\AI\REST\Output\ValueObjectVisitor;
use Ibexa\Contracts\Rest\Output\Generator;
use Ibexa\Contracts\Rest\Output\ValueObjectVisitor;
use Ibexa\Contracts\Rest\Output\Visitor;
final class AudioText extends ValueObjectVisitor
{
private const OBJECT_IDENTIFIER = 'AudioText';
public function visit(Visitor $visitor, Generator $generator, $data): void
{
$mediaType = 'ai.' . self::OBJECT_IDENTIFIER;
$text = $data->getOutput();
$generator->startObjectElement(self::OBJECT_IDENTIFIER, $mediaType);
$visitor->setHeader('Content-Type', $generator->getMediaType($mediaType));
$visitor->visitValueObject($text);
$generator->endObjectElement(self::OBJECT_IDENTIFIER);
}
}
|
| App\AI\REST\Output\ValueObjectVisitor\AudioText:
parent: Ibexa\Contracts\Rest\Output\ValueObjectVisitor
tags:
- { name: ibexa.rest.output.value_object.visitor, type: App\AI\REST\Value\AudioText }
|
You can now execute a specific Action Configuration for the new custom Action Type through REST API by sending the following request:
| POST /ai/action/execute/my_action_configuration HTTP/1.1
Accept: application/vnd.ibexa.api.ai.AudioText+json
Content-Type: application/vnd.ibexa.api.ai.TranscribeAudio+json
|
| {
"TranscribeAudio": {
"Audio": {
"base64": "audioEncodedInBase64"
},
"RuntimeContext": {
"languageCode": "eng-GB"
}
}
}
|
Integrate into the back office
The last step in fully integrating the Transcribe Audio Action Type embeds it directly into the back office, allowing Editors to invoke it while doing their daily work.
Extend the default editing template of the ezbinaryfile
fieldtype by creating a new file called templates/themes/admin/admin/ui/fieldtype/edit/form_fields_binary_ai.html.twig
.
This template embeds the AI component, but only if a dedicated transcript
field (of eztext
type) is available in the same content type to store the content of the transcription.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25 | {% extends '@ibexadesign/ui/field_type/edit/ezbinaryfile.html.twig' %}
{% block ezbinaryfile_preview %}
{{ parent() }}
{% set transcriptFieldIdentifier = 'transcript' %}
{% set fieldTypeIdentifiers = form.parent.parent.vars.value|keys %}
{% if transcriptFieldIdentifier in fieldTypeIdentifiers %}
{% set module_id = 'TranscribeAudio' %}
{% set ai_config_id = 'transcribe_audio' %}
{% set container_selector = '.ibexa-edit-content' %}
{% set input_selector = '.ibexa-field-edit-preview__action--preview' %}
{% set output_selector = '#ezplatform_content_forms_content_edit_fieldsData_transcript_value' %}
{% set cancel_wrapper_selector = '.ibexa-field-edit-preview__media-wrapper' %}
{% embed '@ibexadesign/connector_ai/ui/ai_module/ai_component.html.twig' with {
ai_config_id,
container_selector,
input_selector,
output_selector,
} %}
{% endembed %}
{% endif %}
{% endblock %}
|
And add it to the SiteAccess configuration for the admin_group
:
| ibexa:
system:
admin_group:
admin_ui_forms:
content_edit:
form_templates:
- { template: '@ibexadesign/admin/ui/fieldtype/edit/form_fields_binary_ai.html.twig', priority: -10 } }
|
The configuration of the AI component takes the following parameters:
module_id
- name of the JavaScript module to handle the invoked action. ImgToText
is a built-in one handling alternative text use case, TranscribeAudio
is a custom one.
ai_config_id
- identifier of the Action Type to load Action Configurations for. The ibexa_ai_config Twig function is used under the hood.
container_selector
- CSS selector to narrow down the HTML area which is affected by the AI component.
input_selector
- CSS selector indicating the input field (must be below the container_selector
in the HTML structure).
output_selector
- CSS selector indicating the output field (must be below the container_selector
in the HTML structure).
cancel_wrapper_selector
- CSS selector indicating the element to which the "Cancel AI" UI element is attached.
Now create the JavaScript module mentioned in the template that is responsible for:
- gathering the input data (downloading the attached binary file and converting it into base64)
- executing the Action Configuration chosen by the editor through the REST API
- attaching the response to the output field
You can find the code of the module below. Place it in a file called assets/js/transcribe.audio.js
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67 | import BaseAIComponent from '../../vendor/ibexa/connector-ai/src/bundle/Resources/public/js/core/base.ai.component';
export default class TranscribeAudio extends BaseAIComponent {
constructor(mainElement, config) {
super(mainElement, config);
this.requestHeaders = {
Accept: 'application/vnd.ibexa.api.ai.AudioText+json',
'Content-Type': 'application/vnd.ibexa.api.ai.TranscribeAudio+json',
};
}
getAudioInBase64() {
const request = new XMLHttpRequest();
request.open('GET', this.inputElement.href, false);
request.overrideMimeType('text/plain; charset=x-user-defined');
request.send();
if (request.status === 200) {
return this.convertToBase64(request.responseText);
}
else {
this.processError('Error occured when decoding the file.');
}
}
getRequestBody() {
const body = {
TranscribeAudio: {
Audio: {
base64: this.getAudioInBase64(),
},
RuntimeContext: {},
},
};
if (this.languageCode) {
body.TranscribeAudio.RuntimeContext.languageCode = this.languageCode;
}
return JSON.stringify(body);
}
afterFetchData(response) {
super.afterFetchData();
if (response) {
this.outputElement.value = response.AudioText.Text.text[0];
}
}
toggle(forceEnabled) {
super.toggle(forceEnabled);
this.outputElement.disabled = !forceEnabled || !this.outputElement.disabled;
}
convertToBase64(data) {
let binary = '';
for (let i = 0; i < data.length; i++) {
binary += String.fromCharCode(data.charCodeAt(i) & 0xff);
}
return btoa(binary);
}
}
|
The last step is adding the module to the list of AI modules in the system, by using the provided addModule
function.
Create a file called assets/js/addAudioModule.js
:
| import { addModule } from '../../vendor/ibexa/connector-ai/src/bundle/Resources/public/js/core/create.ai.module';
import TranscribeAudio from './transcribe.audio';
addModule(TranscribeAudio);
|
And include it into the back office using Webpack Encore.
See configuring assets from main project files to learn more about this mechanism.
| const ibexaConfigManager = require('./ibexa.webpack.config.manager.js');
ibexaConfigManager.add({
ibexaConfig,
entryName: 'ibexa-admin-ui-layout-js',
newItems: [
path.resolve(__dirname, './assets/js/addAudioModule.js')
],
});
|
Your custom Action Type is now fully integrated into the back office UI and can be used by the Editors.