Introduction To Spring Boot AI
The world of artificial intelligence is rapidly evolving, and integrating AI capabilities into existing applications has become a crucial skill for modern developers. The Spring Boot AI framework, an extension of the popular Spring Boot ecosystem, aims to simplify this process for Java developers. In this post, we’ll explore what Spring Boot AI offers and how it can enhance your applications.
What is Spring Boot AI?
Spring Boot AI is a framework that provides a set of abstractions and integrations for working with various AI and machine learning services. It’s designed to make it easy for developers to add AI capabilities to their Spring Boot applications without having to deal with the complexities of different AI service APIs.
Key Features:
- Portable API: Interact with different LLMs using a consistent API, reducing development effort and promoting code reusability (see the sketch after this list).
- Model Integration: Seamlessly connect to popular LLM providers such as OpenAI, Azure OpenAI, Amazon Bedrock, Google Vertex AI, Ollama, and Hugging Face.
- Easy Configuration: Like other Spring Boot starters, Spring Boot AI can be easily added to your project and configured via application properties.
- Prompt Engineering: The framework provides utilities for constructing and managing prompts, which are crucial for effective communication with language models.
- Output Parsing: Easily convert LLM responses into structured data formats like POJOs for further processing.
- Vector Database Integration: Leverage vector databases to store and retrieve embeddings for advanced AI applications.
- Function Calling: Enable LLMs to interact with external systems and data sources through function calls.
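To make the "Portable API" point concrete, here is a minimal, hypothetical sketch (the class and method names are illustrative, not part of the example built later in this post): the same ChatClient code runs against whichever provider starter you have added and configured through properties.

// A sketch of the portable API idea: this service does not reference any
// specific provider. The auto-configured ChatClient.Builder is backed by
// whichever chat model starter (Ollama, OpenAI, Azure OpenAI, ...) is on the classpath.
@Service
public class QuoteService {

    private final ChatClient chatClient;

    public QuoteService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String quoteOfTheDay() {
        return chatClient.prompt()
                .user("Give me an inspirational quote about learning")
                .call()
                .content();
    }
}

Switching providers then comes down to swapping the starter dependency and the corresponding spring.ai.* properties; the service code stays the same.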
Using the Spring Boot AI Framework
Let’s take a look at a simple example that demonstrates how to use the Spring Boot AI framework to interact with an LLM. In this example, we will use Ollama and open-source LLM models to generate text based on a prompt.
Note
You can follow the guide here to set up Ollama and start using LLM models locally.
Go to start.spring.io to create a new Spring project and add the following dependencies:
- Spring Web
- Ollama AI
Open the project in your IDE.
Configure your Ollama AI service properties in your application.properties:
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama3:8b
Here’s a simple example of how to use Spring Boot AI to interact with an Ollama AI model:
@RestController
@RequestMapping("/chatai")
public class ChatAIController {

    private final ChatClient chatClient;

    public ChatAIController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping("/movieRecommendation")
    public String getMovieRecommendations(@RequestParam(value = "genre") String genre) {
        String userPrompt = "Recommend me a movie in %s genre".formatted(genre);
        ChatResponse response =
                chatClient.prompt(new Prompt(userPrompt)).call().chatResponse();
        return response.getResult().getOutput().getContent();
    }
}
ChatClient
The ChatClient offers an API for communicating with an AI model. It supports both synchronous and reactive programming models, and it is the standard interface for interacting with different AI providers such as OpenAI, Ollama, Google Vertex AI, and Azure OpenAI.
The AI model handles two primary types of messages: user messages, which are direct inputs from the user, and system messages, which are used by the system to steer the conversation.
These messages typically include placeholders that are dynamically replaced at runtime with user input, allowing the AI model to tailor its responses to the user’s specific input.
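As a rough illustration of how such placeholders can be filled in at runtime, the sketch below uses Spring AI's PromptTemplate; the endpoint itself is a hypothetical addition, not part of the example project.

// Sketch: building a prompt from a template with a {genre} placeholder that is
// replaced with the user's input at request time.
@GetMapping("/movieRecommendationFromTemplate")
public String getMovieRecommendationFromTemplate(@RequestParam("genre") String genre) {
    PromptTemplate template = new PromptTemplate("Recommend me a movie in {genre} genre");
    Prompt prompt = template.create(Map.of("genre", genre));
    return chatClient.prompt(prompt).call().content();
}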
Now let’s test the application for movie recommendation.
GET http://localhost:8080/chatai/movieRecommendation?genre="action"
HTTP/1.1 200
Content-Type: text/plain;charset=UTF-8
Content-Length: 1221
Date: Sun, 25 Aug 2024 07:07:15 GMT
Here's a recommendation:
**John Wick: Chapter 3 - Parabellum (2019)**
Starring Keanu Reeves, Halle Berry, and Ian McShane, this third installment in the John Wick franchise is an adrenaline-fueled thrill ride. The movie takes place immediately after the events of the second film, with John Wick on the run from a global network of hitmen.
The action scenes are incredibly well-choreographed and visually stunning, with plenty of guns, swords, and martial arts combat to keep you on the edge of your seat. The movie also has a great storyline, with twists and turns that will keep you guessing until the very end.
Other notable action movies:
* **Atomic Blonde (2017)**: A stylish and intense spy thriller set in 1980s Berlin.
* **The Raid: Redemption (2011)**: An Indonesian martial arts film known for its intense and unrelenting action sequences.
* **Mad Max: Fury Road (2015)**: A post-apocalyptic action movie with stunning stunts, impressive vehicles, and a strong female lead.
* **Bourne Ultimatum (2007)**: The third installment in the Jason Bourne franchise, featuring high-stakes action and thrilling chase sequences.
Let me know if you have any specific preferences or if you'd like more recommendations!
The above example shows the LLM's response based on the user prompt.
Now let's look at the LLM's response when a system prompt is added for the same movie user prompt.
public class ChatAIController {

    ...

    @GetMapping("/movieRecommendationWithSystemPrompt")
    public String getMovieRecommendationsWithSystemPrompt(@RequestParam(value = "genre") String genre) {
        String userPrompt = "Recommend me a movie in %s genre".formatted(genre);
        String systemPrompt = "You are a movie recommender, helping users discover new films " +
                "based on their preferences, moods, and interests. Offer personalized " +
                "recommendations, provide insights into the movies' plots, themes, and key " +
                "features, and suggest similar films that users may enjoy. Help users find their " +
                "next favorite movie experience.";
        ChatResponse response =
                chatClient.prompt().user(userPrompt).system(systemPrompt).call().chatResponse();
        return response.getResult().getOutput().getContent();
    }

}
Following is the response from the LLM for the given system prompt. You can observe that the response has changed based on the system prompt.
GET http://localhost:8080/chatai/movieRecommendationWithSystemPrompt?genre="action"
HTTP/1.1 200
Content-Type: text/plain;charset=UTF-8
Content-Length: 1235
Date: Sun, 25 Aug 2024 06:53:35 GMT
I'd be happy to recommend an action-packed movie for you!
Based on my analysis of popular action movies, I think you might enjoy:
**Movie Recommendation:** John Wick (2014)
**Plot:** Retired hitman John Wick seeks vengeance against a powerful crime lord and his associates after they murder his dog, a gift from his deceased wife. As he delves deeper into the world of organized crime, John Wick must use his exceptional combat skills to take down his enemies and clear his name.
**Key Features:**
* Gritty, stylish visuals with impressive action sequences
* A unique blend of gunplay, hand-to-hand combat, and creative fight choreography
* A memorable performance by Keanu Reeves as the stoic and deadly John Wick
**Themes:** Redemption, Revenge, Loyalty, and the consequences of getting back into the "game"
If you enjoy John Wick, you might also like:
1. Atomic Blonde (2017) - a stylish and intense spy thriller with impressive fight sequences
2. The Accountant (2016) - a dark and action-packed story about a socially awkward hitman with exceptional math skills
3. Taken (2008) - an adrenaline-fueled, fast-paced action movie with Liam Neeson as a former CIA operative
So, are you ready to take on the world of John Wick?
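As an optional variation (not part of the original example), a default system prompt can also be set once when the ChatClient is built, so that every request uses it without passing it explicitly:

// Sketch: configuring a default system prompt on the builder. The prompt text
// mirrors the system prompt used above; this constructor is shown for illustration only.
public ChatAIController(ChatClient.Builder chatClientBuilder) {
    this.chatClient = chatClientBuilder
            .defaultSystem("You are a movie recommender, helping users discover new films " +
                    "based on their preferences, moods, and interests.")
            .build();
}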
Returning an Entity
In the above examples we have returned the LLM's response as a string. In a real-world application we may want to return it as an entity, for example to render it in a browser. The entity method provides this functionality.
For example, given the Java record:
public record ActorMovies(String actor, List<String> movies) {
}
You can map the AI model's output to this record using the entity method, as shown below:
@GetMapping("/moviesRecommendationOfActor")<br> public ActorMovies getMoviesRecommendationOfActor(@RequestParam(value = "actor") String actor) {<br> String userPrompt = "Recommend block buster movies for actor %s".formatted(actor);<br> return chatClient.prompt().user(userPrompt).call().entity(ActorMovies.class);<br> }<br>
Following is the response from the Ollama LLM model:
GET http://localhost:8080/chatai/moviesRecommendationOfActor?actor="pierce brosnan"
HTTP/1.1 200
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 25 Aug 2024 09:13:06 GMT
{
"actor": "Pierce Brosnan",
"movies": [
"The Thomas Crown Affair (1999)",
"Evelyn (2002)",
"Lathammer (2004)",
"Mamma Mia! (2008)",
"I Don't Feel at Home in This Mutilated Subdivision (2017)"
]
}
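Conceptually, the entity method appends format instructions to the prompt and then converts the model's JSON reply into the target type. The following sketch does this by hand using Spring AI's BeanOutputConverter; the endpoint is a hypothetical addition, and the exact converter API may differ slightly between Spring AI versions.

// Sketch of roughly what entity(ActorMovies.class) does under the hood.
@GetMapping("/moviesRecommendationOfActorManual")
public ActorMovies getMoviesRecommendationOfActorManual(@RequestParam("actor") String actor) {
    BeanOutputConverter<ActorMovies> converter = new BeanOutputConverter<>(ActorMovies.class);
    String userPrompt = "Recommend blockbuster movies for actor %s. %s"
            .formatted(actor, converter.getFormat()); // appends JSON format instructions
    String json = chatClient.prompt().user(userPrompt).call().content();
    return converter.convert(json); // maps the JSON reply onto the record
}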
There is also an overloaded entity method with the signature entity(ParameterizedTypeReference<T> type) that lets you specify types such as generic Lists:
@GetMapping("/moviesRecommendationOfActors")<br> public List<ActorMovies> getMoviesRecommendationOfActors(@RequestParam(value = "actor") String actor) {<br> String userPrompt = "Recommend block buster movies for actor %s and %s".formatted("pierce" +<br> " brasnon", "johnny depp");<br> return chatClient.prompt().user(userPrompt).call().entity(new ParameterizedTypeReference<List<ActorMovies>>() {});<br> }
In the response you can see that we get a list of objects:
GET http://localhost:8080/chatai/moviesRecommendationOfActors
HTTP/1.1 200
Content-Type: application/json
Transfer-Encoding: chunked
Date: Sun, 25 Aug 2024 12:34:51 GMT
[
{
"actor": "Pierce Brosnan",
"movies": [
"The Thomas Crown Affair",
"Mamma Mia!",
"The Ghost Writer"
]
},
{
"actor": "Johnny Depp",
"movies": [
"Pirates of the Caribbean: The Curse of the Black Pearl",
"Charlie and the Chocolate Factory",
"Edward Scissorhands"
]
}
]
Streaming Responses
The stream method lets you get an asynchronous, streaming response.
Streaming of responses refers to the real-time generation and display of text, where the output is presented as it’s being generated rather than after the entire response is complete. This simulates a more natural, conversational flow, making the interaction feel quicker and more dynamic.
In interfaces like ChatGPT, the “streaming” feature mimics how humans respond during a conversation. Instead of waiting for the full response to be generated and then displayed all at once, the system sends parts of the response as they’re being processed.
By streaming these words gradually, the user gets an impression that the system is thinking and responding live.
This approach reduces perceived wait times and makes interactions feel more conversational.
@GetMapping("/streamSeriesRecommendation")
public Flux<String> getStreamOfWebSeriesRecommendations(@RequestParam(value = "category") String category) {
String userPrompt = "Recommend me a web series in %s genre".formatted(category);
return chatClient.prompt().user(userPrompt).stream().content();
}
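If you want a browser or HTTP client to render the tokens as they arrive, one option (an illustrative addition, not part of the original example) is to expose the stream as server-sent events by setting the produces attribute:

// Sketch: streaming the generated tokens to the client as server-sent events.
@GetMapping(value = "/streamSeriesRecommendationSse", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamWebSeriesRecommendationsAsSse(@RequestParam("category") String category) {
    String userPrompt = "Recommend me a web series in %s genre".formatted(category);
    return chatClient.prompt().user(userPrompt).stream().content();
}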
Using OpenAI ChatGPT models
The above examples can also be used with OpenAI's GPT models.
To use the OpenAI GPT models you need an API key.
To get the API key:
- Go to https://platform.openai.com/, sign up, and log in.
- Next, go to the https://platform.openai.com/settings/organization/billing/overview page, add your payment details, and buy credits for your usage.
- Next, go to https://platform.openai.com/api-keys, create an API key by clicking the “Create new secret key” button, and copy the API key.
In the application.properties file, replace the following properties:
spring.ai.ollama.base-url
spring.ai.ollama.chat.options.model
with:
spring.ai.openai.api-key=sk-proj-xxx
spring.ai.openai.chat.options.model=gpt-3.5-turbo
You will also need to replace the Ollama starter dependency with the Spring AI OpenAI starter in your build file. After replacing the dependency and properties, when you run the application again you will get responses from the OpenAI models.
You can download the source code for this blog post from GitHub.