azure ai foundry
12 TopicsModel Mondays S2:E2 - Understanding Model Context Protocol (MCP)
This week in Model Mondays, we focus on the Model Context Protocol (MCP) — and learn how to securely connect AI models to real-world tools and services using MCP, Azure AI Foundry, and industry-standard authorization. Read on for my recap About Model Mondays Model Mondays is a weekly series designed to help you build your Azure AI Foundry Model IQ step by step. Here’s how it works: 5-Minute Highlights – Quick news and updates about Azure AI models and tools on Monday 15-Minute Spotlight – Deep dive into a key model, protocol, or feature on Monday 30-Minute AMA on Friday – Live Q&A with subject matter experts from Monday livestream If you want to grow your skills with the latest in AI model development, Model Mondays is the place to start. Want to follow along? Register Here - to watch upcoming Mondel Monday livestreams Watch Playlists to replay past Model Monday episodes Register Here - to join the AMA on MCP on Friday Jun 27 Visit The Forum- to view Foundry Friday AMAs and recaps Spotlight On: Model Context Protocol (MCP) This week, the Model Monday’s spotlight was on the Model Context Protocol (MCP) with subject matter expert Den Delimarsky. Don't forget to check out the slides from the presentation, for resource links! In this blog post, I’ll talk about my five key takeaways from this episode: What Is MCP and Why Does It Matter? What Is MCP Authorization and Why Is It Important? How Can I Get Started with MCP? Spotlight: My Aha Moment Highlights: What’s New in Azure AI 1 . What Is MCP and Why is it Important? MCP is a protocol that standardizes how AI applications connect the underlying AI models to required knowledge sources (data) and interaction APIs (functions) for more effective task execution. Because these models are pre-trained, they lack access to real-time or proprietary data sources (for knowledge) and real-world environments (for interaction). MCP allows them to "discover and use" relevant knowledge and action tools to add relevant context to the model for task execution. Explore: The MCP Specification Learn: MCP For Beginners Want to learn more about MCP - check out the AI Engineer World Fair 2025 "MCP and Keynotes" track. It kicks off with a keynote from Asha Sharma that gives you a broader vision for Azure AI Foundry. Then look for the talk from Harald Kirschner on MCP and VS Code. 2. What Is MCP Authorization and Why Does It Matter? MCP (Model Context Protocol) authorization is a system that helps developers manage who can access their apps, especially when they are hosted in the cloud. The goal is to simplify the process of securing these apps by using common tools like OAuth and identity providers (such as Google or GitHub), so developers don't have to be security experts. Key Takeaways: The new MCP proposal uses familiar identity providers to simplify the authorization process. It allows developers to secure their apps without requiring deep knowledge of security. The update ensures better security controls and prepares the system for future authentication methods. Related Reading: Aaron Parecki, Let's Fix OAuth in MCP Den Delimarsky, Improving The MCP Authorization Spec - One RFC At A Time MCP Specification, Authorization protocol draft On Monday, Den joined us live to talk about the work he did for the authorization protocol. Watch the session now to get a sense for what the MCP Authorization protocol does, how it works, and why it matters. Have questions? Submit them to the forum or Join the Foundry Friday AMA on Jun 27 at 1:30pm ET. 3. How Can I Get Started? If you want to start working with MCP, here’s how to do it easily: Learn the Fundamentals: Explore MCP For Beginners Use an MCP Server: Explore VSCode Agent Mode support . Use MCP with AI Agents: Explore the Azure MCP Server 4. What’s New in Azure AI Foundry? Managed Compute for Cohere Models: Faster, secure AI deployments with low latency. Prompt Shields: New Azure security system to protect against prompt injection and unsafe content. OpenAI o3 Pro Model: A fast, low-cost model similar to GPT-4 Turbo. Codex Mini Model: A smaller, quicker model perfect for developer command-line tasks. MCP Security Upgrades: Now easier to secure AI apps using familiar OAuth identity providers. 5. My Aha Moment Before this session, I used to think that connecting apps to AI was complicated and risky. I believed developers had to build their own security systems from scratch, which sounded tough. But this week, I learned that MCP makes it simple. We can now use trusted logins like Google or GitHub and securely connect AI models to real-world apps without extra hassle. How I Learned This ? To be honest, I also used Copilot to help me understand and summarize this topic in simple words. I wanted to make sure I really understood it well enough to explain it to my friends and peers. I believe in learning with the tools we have, and AI is one of them. By using Copilot and combining it with what I learned from the Model Monday’s session, I was able to write this blog in a way that is easy to understand Takeaway for Beginners: It’s okay to use AI to learn what matters is that you grow, verify, and share the knowledge in your own way. Coming Up Next Week: Next week, we dive into SLMs & Reasoning (Phi-4) with Mojan Javaheripi, PhD, Senior Researcher at Microsoft Research. This session will explore how Small Language Models (SLMs) can perform advanced reasoning tasks, and what makes models like Phi-4 reasoning efficient, scalable, and useful in practical AI applications. Register Here! Join The Community Great devs don't build alone! In a fast-pased developer ecosystem, there's no time to hunt for help. That's why we have the Azure AI Developer Community. Join us today and let's journey together! Join the Discord - for real-time chats, events & learning Explore the Forum - for AMA recaps, Q&A, and help! About Me: I'm Sharda, a Gold Microsoft Learn Student Ambassador interested in cloud and AI. Find me on Github, Dev.to, Tech Community and Linkedin. In this blog series I have summarized my takeaways from this week's Model Mondays livestream.91Views0likes0CommentsCreate Stunning AI Videos with Sora on Azure AI Foundry!
Special credit to Rory Preddy for creating the GitHub resource that enable us to learn more about Azure Sora. Reach him out on LinkedIn to say thanks. Introduction Artificial Intelligence (AI) is revolutionizing content creation, and video generation is at the forefront of this transformation. OpenAI's Sora, a groundbreaking text-to-video model, allows creators to generate high-quality videos from simple text prompts. When paired with the powerful infrastructure of Azure AI Foundry, you can harness Sora's capabilities with scalability and efficiency, whether on a local machine or a remote setup. In this blog post, I’ll walk you through the process of generating AI videos using Sora on Azure AI Foundry. We’ll cover the setup for both local and remote environments. Requirements: Azure AI Foundry with sora model access A Linux Machine/VM. Make sure that the machine already has the package below: Java JRE 17 (Recommended) OR later Maven Step Zero – Deploying the Azure Sora model on AI Foundry Navigate to the Azure AI Foundry portal and head to the “Models + Endpoints” section (found on the left side of the Azure AI Foundry portal) > Click on the “Deploy Model” button > “Deploy base model” > Search for Sora > Click on “Confirm”. Give a deployment name and specify the Deployment type > Click “Deploy” to finalize the configuration. You should receive an API endpoint and Key after successful deploying Sora on Azure AI Foundry. Store these in a safe place because we will be using them in the next steps. Step one – Setting up the Sora Video Generator in the local/remote machine. Clone the roryp/sora repository on your machine by running the command below: git clone https://github.com/roryp/sora.git cd sora Then, edit the application.properties file in the src/main/resources/ folder to include your Azure OpenAI Credentials. Change the configuration below: azure.openai.endpoint=https://your-openai-resource.cognitiveservices.azure.com azure.openai.api-key=your_api_key_here If port 8080 is used for another application, and you want to change the port for which the web app will run, change the “server.port” configuration to include the desired port. Allow appropriate permissions to run the “mvnw” script file. chmod +x mvnw Run the application ./mvnw spring-boot:run Open your browser and type in your localhost/remote host IP (format: [host-ip:port]) in the browser search bar. If you are running a remote host, please do not forget to update your firewall/NSG to allow inbound connection to the configured port. You should see the web app to generate video with Sora AI using the API provided on Azure AI Foundry. Now, let’s generate a video with Sora Video Generator. Enter a prompt in the first text field, choose the video pixel resolution, and set the video duration. (Due to technical limitation, Sora can only generate video of a maximum of 20 seconds). Click on the “Generate video” button to proceed. The cost to generate the video should be displayed below the “Generate Video” button, for transparency purposes. You can click on the “View Breakdown” button to learn more about the cost breakdown. The video should be ready to download after a maximum of 5 minutes. You can check the status of the video by clicking on the “Check Status” button on the web app. The web app will inform you once the download is ready and the page should refresh every 10 seconds to fetch real-time update from Sora. Once it is ready, click on the “Download Video” button to download the video. Conclusion Generating AI videos with Sora on Azure AI Foundry is a game-changer for content creators, marketers, and developers. By following the steps outlined in this guide, you can set up your environment, integrate Sora, and start creating stunning AI-generated videos. Experiment with different prompts, optimize your workflow, and let your imagination run wild! Have you tried generating AI videos with Sora or Azure AI Foundry? Share your experiences or questions in the comments below. Don’t forget to subscribe for more AI and cloud computing tutorials!230Views0likes0CommentsS2E01 Recap: Advanced Reasoning Session
About Model Mondays Want to know what Reasoning models are and how you can build advanced reasoning scenarios like a Deep Research agent using Azure AI Foundry? Check out this recap from Model Mondays Season 2 Ep 1. Model Mondays is a weekly series to help you build your model IQ in three steps: 1. Catch the 5-min Highlights on Monday, to get up to speed on model news 2. Catch the 15-min Spotlight on Monday, for a deep-dive into a model or tool 3. Catch the 30-min AMA on Friday, for a Q&A session with subject matter experts Want to follow along? Register Here- to watch upcoming livestreams for Season 2 Visit The Forum- to see the full AMA schedule for Season 2 Register Here - to join the AMA on Friday Jun 20 Spotlight On: Advanced Reasoning This week, the Model Mondays spotlight was on Advanced Reasoning with subject matter expert Marlene Mhangami. In this blog post, I'll talk about my five takeaways from this episode: Why Are Reasoning Models Important? What Is an Advanced Reasoning Scenario? How Can I Get Started with Reasoning Models ? Spotlight: My Aha Moment Highlights: What’s New in Azure AI 1. Why Are Reasoning Models Important? In today's fast-evolving AI landscape, it's no longer enough for models to just complete text or summarize content. We need AI that can: Understand multi-step tasks Make decisions based on logic Plan sequences of actions or queries Connect context across turns Reasoning models are large language models (LLMs) trained with reinforcement learning techniques to "think" before they answer. Rather than simply generating a response based on probability, these models follow an internal thought process producing a chain of reasoning before responding. This makes them ideal for complex problem-solving tasks. And they’re the foundation of building intelligent, context-aware agents. They enable next-gen AI workflows in everything from customer support to legal research and healthcare diagnostics. Reason: They allow AI to go beyond surface-level response and deliver solutions that reflect understanding, not just language patterning. 2. What does Advanced Reasoning involve? An advanced reasoning scenario is one where a model: Breaks a complex prompt into smaller steps Retrieves relevant external data Uses logic to connect dots Outputs a structured, reasoned answer Example: A user asks: What are the financial and operational risks of expanding a startup to Southeast Asia in 2025? This is the kind of question that requires extensive research and analysis. A reasoning model might tackle this by: Retrieving reports on Southeast Asia market conditions Breaking down risks into financial, political, and operational buckets Cross-referencing data with recent trends Returning a reasoned, multi-part answer 3. How Can I Get Started with Reasoning Models? To get started, you need to visit a catalog that has examples of these models. Try the GitHub Models Marketplace and look for the reasoning category in the filter. Try the Azure AI Foundry model catalog and look for reasoning models by name. Example: The o-series of models from Azure Open AI The DeepSeek-R1 models The Grok 3 models The Phi-4 reasoning models Next, you can use SDKs or Playground for exploring the model capabiliies. 1. Try Lab 331 - for a beginner-friendly guide. 2. Try Lab 333 - for an advanced project. 3. Try the GitHub Model Playground - to compare reasoning and GPT models. 4. Try the Deep Research Agent using LangChain - sample as a great starting project. Have questions or comments? Join the Friday AMA on Azure AI Foundry Discord: 4. Spotlight: My Aha Moment Before this session, I thought reasoning meant longer or more detailed responses. But this session helped me realize that reasoning means structured thinking — models now plan, retrieve, and respond with logic. This inspired me to think about building AI agents that go beyond chat and actually assist users like a teammate. It also made me want to dive deeper into LangChain + Azure AI workflows to build mini-agents for real-world use. 5. Highlights: What’s New in Azure AI Here’s what’s new in the Azure AI Foundry: Direct From Azure Models - Try hosted models like OpenAI GPT on PTU plans SORA Video Playground - Generate video from prompts via SORA models Grok 3 Models - Now available for secure, scalable LLM experiences DeepSeek R1-0528 - A reasoning-optimized, Microsoft-tuned open-source model These are all available in the Azure Model Catalog and can be tried with your Azure account. Did You Know? Your first step is to find the right model for your task. But what if you could have the model automatically selected for you_ based on the prompt you provide? That's the magic of Model Router a deployable AI chat model that dynamically selects the best LLM based on your prompt. Instead of choosing one model manually, the Router makes that choice in real time. Currently, this works with a fixed set of Azure OpenAI models, including a reasoning model option. Keep an eye on the documentation for more updates. Why it’s powerful: Saves cost by switching between models based on complexity Optimizes performance by selecting the right model for the task Lets you test and compare model outputs quickly Try it out in Azure AI Foundry or read more in the Model Catalog Coming Up Next Next week, we dive into Model Context Protocol, an open protocol that empowers agentic AI applications by making it easier to discover and integrate knowledge and action tools with your model choices. Register Here to get reminded - and join us live on Monday! Join The Community Great devs don't build alone! In a fast-pased developer ecosystem, there's no time to hunt for help. That's why we have the Azure AI Developer Community. Join us today and let's journey together! Join the Discord - for real-time chats, events & learning Explore the Forum - for AMA recaps, Q&A, and help! About Me. I'm Sharda, a Gold Microsoft Learn Student Ambassador interested in cloud and AI. Find me on Github, Dev.to,, Tech Community and Linkedin. In this blog series I have summarizef my takeaways from this week's Model Mondays livestream .189Views0likes0CommentsConfigure Embedding Models on Azure AI Foundry with Open Web UI
Introduction Let’s take a closer look at an exciting development in the AI space. Embedding models are the key to transforming complex data into usable insights, driving innovations like smarter chatbots and tailored recommendations. With Azure AI Foundry, Microsoft’s powerful platform, you’ve got the tools to build and scale these models effortlessly. Add in Open Web UI, a intuitive interface for engaging with AI systems, and you’ve got a winning combo that’s hard to beat. In this article, we’ll explore how embedding models on Azure AI Foundry, paired with Open Web UI, are paving the way for accessible and impactful AI solutions for developers and businesses. Let’s dive in! To proceed with configuring the embedding model from Azure AI Foundry on Open Web UI, please firstly configure the requirements below. Requirements: Setup Azure AI Foundry Hub/Projects Deploy Open Web UI – refer to my previous article on how you can deploy Open Web UI on Azure VM. Optional: Deploy LiteLLM with Azure AI Foundry models to work on Open Web UI - refer to my previous article on how you can do this as well. Deploying Embedding Models on Azure AI Foundry Navigate to the Azure AI Foundry site and deploy an embedding model from the “Model + Endpoint” section. For the purpose of this demonstration, we will deploy the “text-embedding-3-large” model by OpenAI. You should be receiving a URL endpoint and API Key to the embedding model deployed just now. Take note of that credential because we will be using it in Open Web UI. Configuring the embedding models on Open Web UI Now head to the Open Web UI Admin Setting Page > Documents and Select Azure Open AI as the Embedding Model Engine. Copy and Paste the Base URL, API Key, the Embedding Model deployed on Azure AI Foundry and the API version (not the model version) into the fields below: Click “Save” to reflect the changes. Expected Output Now let us look into the scenario for when the embedding model configured on Open Web UI and when it is not. Without Embedding Models configured. With Azure Open AI Embedding models configured. Conclusion And there you have it! Embedding models on Azure AI Foundry, combined with the seamless interaction offered by Open Web UI, are truly revolutionizing how we approach AI solutions. This powerful duo not only simplifies the process of building and deploying intelligent systems but also makes cutting-edge technology more accessible to developers and businesses of all sizes. As we move forward, it’s clear that such integrations will continue to drive innovation, breaking down barriers and unlocking new possibilities in the AI landscape. So, whether you’re a seasoned developer or just stepping into this exciting field, now’s the time to explore what Azure AI Foundry and Open Web UI can do for you. Let’s keep pushing the boundaries of what’s possible!461Views0likes0CommentsSkill Up On The Latest AI Models & Tools on Model Mondays - Season 2 starts Jun 16!
Quick Links To RSVP for each episode: EP1: Advanced Reasoning Models: https://developer.microsoft.com/en-us/reactor/events/25905/ EP2: Model Context Protocol: https://developer.microsoft.com/en-us/reactor/events/25906/ EP3: SLMs (and Reasoning): https://developer.microsoft.com/en-us/reactor/events/25907/ Get All The Details: https://aka.ms/model-mondays Azure AI Foundry offers the best model choice Did you manage to catch up on all the talks from Microsoft Build 2025? If, like me, you are interested in building AI-driven applications on Azure, you probably started by looking at what’s new in Azure AI Foundry. I recommend you read Asha Sharma’s post for the top 10 things you need to know in this context. And it starts with New Models & Smarter Models! New Models | Azure AI Foundry now has 11,000+ models to choose from – including frontier models from partner providers, and thousands of open-source community variants from Hugging Face. But how do you pick the right one for your needs? Smarter Models | It's not just about model selection, but about the effort required to use the model and validate the results. How can you improve the user experience and build trust and confidence in your customers? Solutions like Model Router and Azure AI Evaluations SDK help. The Challenge? | New models, features, and tools being released daily - the information overload is real. How do we keep up with the latest updates – and skill up on model choices? Say Hello to Model Mondays! Model Mondays is a weekly series with a livestream on Microsoft Reactor (on Mondays) and a follow-up AMA on Azure AI Foundry Discord (on Fridays). Here are the three links to know: Visit the Model Mondays repo: https://aka.ms/model-mondays Watch the Model Mondays playlist: https://aka.ms/model-mondays/playlist Join #model-mondays on Discord: https://aka.ms/model-mondays/discord Visit the playlist or repo to catch up on replays from Season 1 (above). Learn about topics like reasoning models, visual generative models, open-source AI, forecasting models, and local AI development with Visual Studio AI Toolkit. Each 30-minute episode consists of: 5-min Highlights. Catch up on top model-related news from the previous week. 15-min Spotlight. Get a deep dive into a specific model, model family, or related tool. Q&A. Ask questions on chat during the livestream, or join the AMA on Discord on Friday. Register Now to Join Us for Season 2! Microsoft Build showed us that the model catalog is expanding quickly – and so are the tools that are available, to help you select, customize, and evaluate, your AI application. In Season 2, we’re going to dive deeper into advanced topics in models and tools. Some of the topics that we hope to cover include: Advanced Reasoning Models – Deep Research, Visual Reasoning and more. Model Context Protocol – What it is, examples of MCP Servers today. SLMs and Reasoning – We dive into the Phi-4 model ecosystem for insights. Foundry Labs – Explore projects like Magentic-UI and MCP Server. Open-Source Models – Explore the 10K+ community models from Hugging Face. Edge Models – Explore Foundry Local and the rise of on-device AI capabiltiies. Models for AI Agents – Explore the Agent Catalog samples like Red-Teaming Model Playgrounds – Explore Image, Video, Agent, Chat, Language playgrounds Advanced Fine Tuning – Learn to fine GPT models, and use the Foundry Portal AI Developer Experience – Get productive with AI Toolkit & VS Code Extension pack Our first three episodes below are open for registration right now! EP1: Advanced Reasoning Models: https://developer.microsoft.com/en-us/reactor/events/25905/ EP2: Model Context Protocol: https://developer.microsoft.com/en-us/reactor/events/25906/ EP3: SLMs (and Reasoning): https://developer.microsoft.com/en-us/reactor/events/25907/ Let's build our model IQ and get starting developing AI applications on Azure! Want to Build AI Apps - and need resources to accelerate your journey? Chat with us on Azure AI Foundry Discord - https://aka.ms/aifoundrydiscord Provide feedback on our Discussion Forum - https://aka.ms/aifoundryforum Skill up with the Azure AI Foundry Learn Course - http://aka.ms/learnatbuild Review the Azure AI Foundry Documentation - http://aka.ms/AzureAI Download and explore the Azure AI Foundry SDK - http://aka.ms/aifoundrysdk142Views0likes3CommentsStep-by-step: Integrate Ollama Web UI to use Azure Open AI API with LiteLLM Proxy
Introductions Ollama WebUI is a streamlined interface for deploying and interacting with open-source large language models (LLMs) like Llama 3 and Mistral, enabling users to manage models, test them via a ChatGPT-like chat environment, and integrate them into applications through Ollama’s local API. While it excels for self-hosted models on platforms like Azure VMs, it does not natively support Azure OpenAI API endpoints—OpenAI’s proprietary models (e.g., GPT-4) remain accessible only through OpenAI’s managed API. However, tools like LiteLLM bridge this gap, allowing developers to combine Ollama-hosted models with OpenAI’s API in hybrid workflows, while maintaining compliance and cost-efficiency. This setup empowers users to leverage both self-managed open-source models and cloud-based AI services. Problem Statement As of February 2025, Ollama WebUI, still do not support Azure Open AI API. The Ollama Web UI only support self-hosted Ollama API and managed OpenAI API service (PaaS). This will be an issue if users want to use Open AI models they already deployed on Azure AI Foundry. Objective To integrate Azure OpenAI API via LiteLLM proxy into with Ollama Web UI. LiteLLM translates Azure AI API requests into OpenAI-style requests on Ollama Web UI allowing users to use OpenAI models deployed on Azure AI Foundry. If you haven’t hosted Ollama WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Ollama WebUI deployed already. Step 1: Deploy OpenAI models on Azure Foundry. If you haven’t created an Azure AI Hub already, search for Azure AI Foundry on Azure, and click on the “+ Create” button > Hub. Fill out all the empty fields with the appropriate configuration and click on “Create”. After the Azure AI Hub is successfully deployed, click on the deployed resources and launch the Azure AI Foundry service. To deploy new models on Azure AI Foundry, find the “Models + Endpoints” section on the left hand side and click on “+ Deploy Model” button > “Deploy base model” A popup will appear, and you can choose which models to deploy on Azure AI Foundry. Please note that the o-series models are only available to select customers at the moment. You can request access to the o-series models by completing this request access form, and wait until Microsoft approves the access request. Click on “Confirm” and another popup will emerge. Now name the deployment and click on “Deploy” to deploy the model. Wait a few moments for the model to deploy. Once it successfully deployed, please save the “Target URI” and the API Key. Step 2: Deploy LiteLLM Proxy via Docker Container Before pulling the LiteLLM Image into the host environment, create a file named “litellm_config.yaml” and list down the models you deployed on Azure AI Foundry, along with the API endpoints and keys. Replace "API_Endpoint" and "API_Key" with “Target URI” and “Key” found from Azure AI Foundry respectively. Template for the “litellm_config.yaml” file. model_list: - model_name: [model_name] litellm_params: model: azure/[model_name_on_azure] api_base: "[API_ENDPOINT/Target_URI]" api_key: "[API_Key]" api_version: "[API_Version]" Tips: You can find the API version info at the end of the Target URI of the model's endpoint: Sample Endpoint - https://example.openai.azure.com/openai/deployments/o1-mini/chat/completions?api-version=2024-08-01-preview Run the docker command below to start LiteLLM Proxy with the correct settings: docker run -d \ -v $(pwd)/litellm_config.yaml:/app/config.yaml \ -p 4000:4000 \ --name litellm-proxy-v1 \ --restart always \ ghcr.io/berriai/litellm:main-latest \ --config /app/config.yaml --detailed_debug Make sure to run the docker command inside the directory where you created the “litellm_config.yaml” file just now. The port used to listen for LiteLLM Proxy traffic is port 4000. Now that LiteLLM proxy had been deployed on port 4000, lets change the OpenAI API settings on Ollama WebUI. Navigate to Ollama WebUI’s Admin Panel settings > Settings > Connections > Under the OpenAI API section, write http://127.0.0.1:4000 as the API endpoint and set any key (You must write anything to make it work!). Click on “Save” button to reflect the changes. Refresh the browser and you should be able to see the AI models deployed on the Azure AI Foundry listed in the Ollama WebUI. Now let’s test the chat completion + Web Search capability using the "o1-mini" model on Ollama WebUI. Conclusion Hosting Ollama WebUI on an Azure VM and integrating it with OpenAI’s API via LiteLLM offers a powerful, flexible approach to AI deployment, combining the cost-efficiency of open-source models with the advanced capabilities of managed cloud services. While Ollama itself doesn’t support Azure OpenAI endpoints, the hybrid architecture empowers IT teams to balance data privacy (via self-hosted models on Azure AI Foundry) and cutting-edge performance (using Azure OpenAI API), all within Azure’s scalable ecosystem. This guide covers every step required to deploy your OpenAI models on Azure AI Foundry, set up the required resources, deploy LiteLLM Proxy on your host machine and configure Ollama WebUI to support Azure AI endpoints. You can test and improve your AI model even more with the Ollama WebUI interface with Web Search, Text-to-Image Generation, etc. all in one place.5.8KViews1like4CommentsPower Up Your Open WebUI with Azure AI Speech: Quick STT & TTS Integration
Introduction Ever found yourself wishing your web interface could really talk and listen back to you? With a few clicks (and a bit of code), you can turn your plain Open WebUI into a full-on voice assistant. In this post, you’ll see how to spin up an Azure Speech resource, hook it into your frontend, and watch as user speech transforms into text and your app’s responses leap off the screen in a human-like voice. By the end of this guide, you’ll have a voice-enabled web UI that actually converses with users, opening the door to hands-free controls, better accessibility, and a genuinely richer user experience. Ready to make your web app speak? Let’s dive in. Why Azure AI Speech? We use Azure AI Speech service in Open Web UI to enable voice interactions directly within web applications. This allows users to: Speak commands or input instead of typing, making the interface more accessible and user-friendly. Hear responses or information read aloud, which improves usability for people with visual impairments or those who prefer audio. Provide a more natural and hands-free experience especially on devices like smartphones or tablets. In short, integrating Azure AI Speech service into Open Web UI helps make web apps smarter, more interactive, and easier to use by adding speech recognition and voice output features. If you haven’t hosted Open WebUI already, follow my other step-by-step guide to host Ollama WebUI on Azure. Proceed to the next step if you have Open WebUI deployed already. Learn More about OpenWeb UI here. Deploy Azure AI Speech service in Azure. Navigate to the Azure Portal and search for Azure AI Speech on the Azure portal search bar. Create a new Speech Service by filling up the fields in the resource creation page. Click on “Create” to finalize the setup. After the resource has been deployed, click on “View resource” button and you should be redirected to the Azure AI Speech service page. The page should display the API Keys and Endpoints for Azure AI Speech services, which you can use in Open Web UI. Settings things up in Open Web UI Speech to Text settings (STT) Head to the Open Web UI Admin page > Settings > Audio. Paste the API Key obtained from the Azure AI Speech service page into the API key field below. Unless you use different Azure Region, or want to change the default configurations for the STT settings, leave all settings to blank. Text to Speech settings (TTS) Now, let's proceed with configuring the TTS Settings on OpenWeb UI by toggling the TTS Engine to Azure AI Speech option. Again, paste the API Key obtained from Azure AI Speech service page and leave all settings to blank. You can change the TTS Voice from the dropdown selection in the TTS settings as depicted in the image below: Click Save to reflect the change. Expected Result Now, let’s test if everything works well. Open a new chat / temporary chat on Open Web UI and click on the Call / Record button. The STT Engine (Azure AI Speech) should identify your voice and provide a response based on the voice input. To test the TTS feature, click on the Read Aloud (Speaker Icon) under any response from Open Web UI. The TTS Engine should reflect Azure AI Speech service! Conclusion And that’s a wrap! You’ve just given your Open WebUI the gift of capturing user speech, turning it into text, and then talking right back with Azure’s neural voices. Along the way you saw how easy it is to spin up a Speech resource in the Azure portal, wire up real-time transcription in the browser, and pipe responses through the TTS engine. From here, it’s all about experimentation. Try swapping in different neural voices or dialing in new languages. Tweak how you start and stop listening, play with silence detection, or add custom pronunciation tweaks for those tricky product names. Before you know it, your interface will feel less like a web page and more like a conversation partner.431Views1like0CommentsShowcasing Phi-4-Reasoning: A Game-Changer for AI Developers
Showcasing Phi-4-Reasoning: A Game-Changer for AI Developers Introduction Phi-4-Reasoning is a state-of-the-art AI model developed by Microsoft Research, designed to excel in complex reasoning tasks. With its advanced capabilities, Phi-4-Reasoning is a powerful tool for AI developers, enabling them to tackle intricate problems with ease and precision. What is Phi-4-Reasoning? Phi-4-Reasoning is a 14-billion parameter open-weight reasoning model that has been fine-tuned from the Phi-4 model using supervised fine-tuning on a dataset of chain-of-thought traces.. We are also releasing Phi-4-reasoning-plus, a variant enhanced through a short phase of outcome-based reinforcement learning that offers higher performance by generating longer reasoning traces. This model is designed to handle complex reasoning tasks, including mathematical problems, scientific queries, and coding challenges Why Phi-4-reasoning and Phi-4-reasoning-plus Excels Advanced Reasoning Capabilities: Phi-4-Reasoning is trained to generate detailed reasoning chains, making it highly effective in solving complex problems. It leverages additional inference-time compute to enhance its reasoning abilities High-Quality Data Curation: The model's training involved meticulous data curation and the use of high-quality synthetic datasets, allowing it to compete with much larger models Performance: Despite its smaller size, Phi-4-Reasoning outperforms larger models like OpenAI's o1-mini and DeepSeek-R1-Distill-Llama-70B on various benchmarks, including mathematical reasoning and PhD-level science questions Industry Applications: Phi-4-Reasoning is versatile and can be deployed in various industry scenarios, including medicine, where it can be used for reasoning tasks on edge devices Real-World Examples Wordplay Riddle: Phi-4-Reasoning can solve wordplay riddles by recognizing patterns and applying logical reasoning. For example, it deduced that collecting three strawberries would yield 9 r's because the word "strawberry" contains three r's Example 1: How many strawberries for 9 r’s? User: How many strawberries do I need to collect 9 r’s? Phi-4-Reasoning: <think> TRUNCATED </think> It’s a wordplay riddle! Notice that the word “strawberry” has three r’s in it. So if you “collect” three strawberries, you get 3 Å~ 3 = 9 r’s. Thus, you need 3 strawberries. Solving a Riddle: The model can decrypt complex riddles by identifying patterns and performing arithmetic operations. It demonstrated this capability by solving a riddle involving foreign language number-words and reversed number-words. What This Problem Proves: Pattern Recognition: Phi-4-Reasoning demonstrates its ability to identify patterns in words and apply logical reasoning to solve the riddle. Wordplay Understanding: The model showcases its proficiency in understanding and solving wordplay riddles, which require a combination of linguistic knowledge and logical deduction. Efficiency: Phi-4-Reasoning quickly arrives at the correct solution by applying a straightforward multiplication operation, highlighting its efficiency in solving such problems. Example 2: Solving a Riddle User: I will give you a riddle to solve with a few examples, and something to complete at the end What is a likely completion for X that is consistent the with examples above? Phi-4-Reasoning: <think> TRUNCATED </think> We can “decrypt” the riddle by noticing that each line has five parts. In every example the parts are: A “number-word” in a foreign language (or with an extra minus sign), A “first name” (of any nationality), A “reversed number-word” (or two, joined by a hyphen), A “second name”, and A final number. When you “reverse” the third word you get a familiar numberin English: In line 1 “evif” is “five” (5) and the first number is “uno” = 1, and 5 – 1 = 4. In line 2 “neves” is “seven” (7) and “tres” = 3, and 7 – 3 = 4. In line 3 “evlewt” is “twelve” (12) and “drei” = 3, and 12 – 3 = 9. In lines 4 and 5 with two reversed “number-words”: – Line 4: “xis” →“six” (6) and “ytnewt” →“twelve” (12); 6 + 12 = 18, plus 3 = 21. – Line 5: “eerht” → “three” (3) and “ytnewt” → “twelve” (12); 3 + 12 = 15, plus 3 = 18. Now look at the final line: “minus one-point-two Satya eno Bill X” It follows the same five–part format. Notice that “eno” reversed is “one” (1). And the “first number” is “minus onepoint-two” (–1.2). Since for a one–number–word line the rule is: final number = (number from reversed word)−(first number) we compute: final = 1 − (−1.2) = 1 + 1.2 = 2.2 Thus, a likely completion for X is 2.2. Phi-4-Reasoning: Phi-4-Reasoning decrypts the riddle by identifying a pattern in the examples provided. Each line consists of five parts: a number-word in a foreign language, a first name, a reversed number-word, a second name, and a final number. By reversing the third word and performing arithmetic operations, Phi-4-Reasoning determines that the likely completion for X is 2.2 What This Problem Proves: Complex Pattern Recognition: Phi-4-Reasoning excels in recognizing complex patterns involving multiple elements, such as foreign language number-words and reversed number-words. Logical Deduction: The model demonstrates its ability to apply logical deduction to solve the riddle by performing arithmetic operations based on the identified patterns. Versatility: Phi-4-Reasoning showcases its versatility in handling diverse types of riddles and problems, making it a valuable tool for AI developers. Example 3. solve complex graph traversal tasks, such as finding a path through a maze. The maze is represented as an adjacency list, which is a common way to describe graphs in computer science. Each node in the graph (or maze) is connected to other nodes, and the task is to find a valid path from the starting node (node 0) to the destination node Example 3: Solving a maze User: Please solve the maze provided in adjacency list form below, by outputting a valid path from node 0 to node {destination}, in a space-separated list of integers. Adjacency list: {depicted as an image below} Phi-4-Reasoning: <think> TRUNCATED </think> {path depicted as an image below} What This Problem Proves: Graph Traversal Capability: Phi-4-Reasoning can effectively navigate through graphs using algorithms like Depth-First Search (DFS) or Breadth-First Search (BFS). This capability is crucial for solving problems that involve finding paths, cycles, or connectivity in graphs. Logical Reasoning: The model demonstrates its ability to apply logical reasoning to determine the correct sequence of nodes to traverse from the start to the destination. This involves understanding the structure of the graph and making decisions based on the connections between nodes. Pattern Recognition: Phi-4-Reasoning can recognize patterns in the adjacency list and use them to find a solution. This is important for tasks that require identifying and following specific paths or routes. Versatility: The ability to solve a maze using an adjacency list showcases the model's versatility in handling different types of data structures and problem-solving scenarios. This is beneficial for AI developers who need to work with various data representations and algorithms. Efficiency: The model's ability to quickly and accurately find a valid path through the maze highlights its efficiency in solving complex problems. This is valuable for applications that require fast and reliable solutions. Conclusion: Phi-4-Reasoning's ability to solve a maze using an adjacency list demonstrates its advanced reasoning capabilities, making it a powerful tool for AI developers. Its proficiency in graph traversal, logical reasoning, pattern recognition, versatility, and efficiency makes it well-suited for tackling a wide range of complex problems. Deployment and Integration Phi-4-Reasoning can be deployed on various platforms, including Azure AI Foundry and Hugging Face. It supports quantization using tools like Microsoft Olive, making it suitable for deployment on edge devices such as IoT, laptops, and mobile devices. Phi-4-Reasoning is a groundbreaking AI model that offers advanced reasoning capabilities, high performance, and versatility. Its ability to handle complex reasoning tasks makes it an invaluable tool for AI developers, enabling them to create innovative solutions across various industries. References Make Phi-4-mini-reasoning more powerful with industry reasoning on edge devices | Microsoft Community Hub Phi-4 Reasoning Technical Paper Phi-4-Mini-Reasoning Technical Paper One year of Phi: Small language models making big leaps in AI | Microsoft Azure Blog PhiCookBook Access Phi-4-reasoning models Phi Models at Azure AI Foundry Models Phi Models on Hugging Face Phi Models on GitHub Marketplace Models3.1KViews0likes0CommentsAI Agents: Planning and Orchestration with the Planning Design Pattern - Part 7
This blog post, Part 7 in a series on AI agents, focuses on the Planning Design Pattern for effective task orchestration. It explains how to define clear goals, decompose complex tasks into manageable subtasks, and leverage structured output (e.g., JSON) for seamless communication between agents. The post includes code snippets demonstrating how to create a planning agent, orchestrate multi-agent workflows, and implement iterative planning for dynamic adaptation. It also links to a practical example notebook (07-autogen.ipynb) and further resources like AutoGen Magnetic One, encouraging readers to explore advanced planning concepts. Links to the previous posts in the series are provided for easy access to foundational AI agent concepts.839Views1like0CommentsStep-by-Step Tutorial: Building an AI Agent Using Azure AI Foundry
This blog post provides a comprehensive tutorial on building an AI agent using Azure AI Agent service and the Azure AI Foundry portal. AI agents represent a powerful new paradigm in application development, offering a more intuitive and dynamic way to interact with software. They can understand natural language, reason about user requests, and take actions to fulfill those requests. This tutorial will guide you through the process of creating and deploying an intelligent agent on Azure. We'll cover setting up an Azure AI Foundry hub, crafting effective instructions to define the agent's behavior, including recognizing user intent, processing requests, and generating helpful responses. We'll also discuss testing the agent's conversational abilities and provide additional resources for expanding your knowledge of AI agents and the Azure AI ecosystem. This hands-on guide is perfect for anyone looking to explore the practical application of Azure's conversational AI capabilities and build intelligent virtual assistants. Join us as we dive into the exciting world of AI agents.8.8KViews1like2Comments