Add Passkey support to Active Directory
Everyone, please go to the Feedback Hub and upvote my suggestion to add passkey support to Active Directory Domain Services: https://aka.ms/AAw8z54. I am recommending this because there needs to be a standard way to use passkeys in an AD environment.

Windows Server OSConfig and DSCv3
Introduction

I want to start a discussion on the ongoing attempts to modernize Windows configuration and, importantly, infrastructure-as-code. Hopefully this is a healthy discussion that others can engage in. Much of what I'm going to post is stuff we are already aware of, but I want to highlight how this remains an ongoing concern with the Windows Server platform, one that makes it difficult to encourage people to even consider Windows in their environment other than for extremely legacy purposes. I want Windows Server to be the best it can be, and I encourage others to join the conversation!

Problem Statement

Windows Server needs a modernized configuration-as-code system. It:
Must be capable of orchestrating without cloud tools (offline orchestration)
Must provide for regular validation and attestation
Should ideally be easily available to 3rd-party configuration tools. Since Microsoft appears to have little interest in building its own modernized system that isn't Azure-based, it MUST be orchestrated easily and securely by 3rd-party tools.
Should be as robust as GPO at maintaining and enforcing state.

Security configurations in Windows are a right pain to manage with any 3rd-party tooling; the closest thing we have is the SecurityDsc module, which wraps secedit.exe and security policy INF files.

Why is OSConfig not the answer?

OSConfig doesn't let me, as an engineer, clearly define what the state of my machines should be based on my company's business requirements. While the built-in Microsoft policy recommendations are great, there are reasons to deviate from those policies in a predictable and idempotent manner. "Apply an OSConfig baseline, then change settings as needed with special PowerShell commands" is not the answer. That is a pile of imperative code that serves nobody, and it makes implementing this approach extremely challenging in today's world of Kubernetes, Docker, etc.

I encourage the Windows Server team to engage with the PowerShell team on DSC 3.0. I think that team has it right, but they are a small group and do not have the resources to implement everything that would make DSC 3.0 a first-class configuration-as-code platform on Windows. This is where the Windows team should come in. Steve Lee and crew have done a bang-up job on DSC 3.0, including taking feedback from folks to leverage the Azure Bicep language for configuration.

Security Policy Challenge

The way security policies are accessed needs to change. Even with DSC 3.0, I'd end up having to create a similar security policy INF file to import into Windows. It seems silly to have to write all of that out when Windows should simply provide an interface for this. In fact, security policy remains one of the largest obstacles to getting a good platform stood up.

Windows Firewall Policy and GPO

This is the reason host-based firewalling is painful to manage at scale in a Windows environment. GPO is definitely not the right place to be managing Windows Firewall policy at scale, particularly when you often have a core set of management rules plus application-specific needs. Making robust changes becomes a challenge because each policy is separate, preventing you from doing things like inheriting rules from higher-level policies.
While this is an inherent limitation of Group Policy, it highlights the need to get off of GPO as the core policy configuration tool for Windows.

My recommendations

I'd like the Windows team to implement DSC 3.0-compatible resources for managing all core functionality of Windows. If you can do it in a GPO, you should be able to do it with configuration as code (a minimal sketch of that declarative style follows at the end of this post). Please stop relying on the community to make this work; all of it should be first-party to the platform itself. Furthermore, I'd recommend that Microsoft either work with 3rd-party configuration systems (Chef, Ansible, Puppet, Octopus, etc.) or provide its own way to hit the ground running. Perhaps something that integrates visually into Windows Admin Center would be nice.

Conclusion

This is a huge problem in the Windows world, and it continues to fall on deaf ears somewhere in the organization. I have no doubt the engineers on all of these teams know these issues well and may even have discussed fixing them, but clearly there's a breakdown somewhere.
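To make the "declarative, idempotent" style I'm asking for concrete, here is a minimal sketch using the classic PowerShell DSC Registry resource. The specific registry value is only an example I picked, and DSC 3.0 expresses the same intent as a YAML/JSON configuration document rather than a .ps1, but the point is declaring desired state instead of scripting changes:

```powershell
# Minimal illustration of declared state vs. imperative commands.
# The setting below (disabling Autorun) is just an example; substitute whatever your baseline requires.
Configuration ExampleBaseline {
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node 'localhost' {
        Registry DisableAutorun {
            Key       = 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer'
            ValueName = 'NoDriveTypeAutoRun'
            ValueType = 'Dword'
            ValueData = '255'
            Ensure    = 'Present'
        }
    }
}

# Compiling produces a MOF that the Local Configuration Manager can apply and re-verify idempotently.
ExampleBaseline -OutputPath .\ExampleBaseline
```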
Noob needs help with RDP Services

I am new to Windows Server management. I set up a 2019 server in a VM (Hyper-V). I installed the RDP licenses we got from Microsoft after installing the Remote Desktop Services role. Now I am getting an error that the Remote Desktop licensing mode is not configured; it tells me to use Server Manager to specify the RD Connection Broker. Either I neglected to install it or to configure it, I'm not sure. Articles I find say to go to Server Manager -> Remote Desktop Services -> Overview... BUT that tells me I am logged in with a local account and must use a domain account to manage servers and collections. Again, we are not using a DC; this server is not part of a domain. We do not run AD internally, only Azure AD online. We have one program we still run internally, and users RDP to it. Should I remove the service and reinstall? What about the licenses I already added? How do I keep them? Any assistance will be greatly appreciated... J
[Public Preview] Dynamically organize your cloud resources with Azure Service Groups!

With Service Groups, you can now leverage flexible cross-subscription grouping, low-privilege management, nested resource hierarchies, and data aggregation for practical workloads and application monitoring.

Everything New in Azure Governance @ Build 2025
You've come to the right place if you're looking for everything happening with Azure Governance at Microsoft Build, May 19-22, 2025. Azure Governance is an ecosystem of neatly integrated services that help you ensure both speed and control across your cloud environment. From enforcing rules in your cloud environment to querying the state of your resources at scale, Azure Governance services keep your resources secure and compliant with corporate standards.

Join us at Microsoft Build!

#MSBuild Session: "Unlock developer agility with a well governed environment" - Thurs, May 22 @ 8:30 AM PDT
In a world where app and environment requirements are ever changing, maintaining control can be a moving target. Come learn how to empower your developers to achieve more, without compromising on security, compliance, or operational best practices, through Azure Governance products. In this session we'll discuss newly released features within Azure Policy, dive deep into policy as code, and announce a new grouping construct called Service Groups designed to optimize cross-subscription management.
Join the session here: https://aka.ms/AzGovBuild25

Sign up for our #MSBuild Product Roundtable Sessions!

Are you attending Build 2025 in person in Seattle? If so, Azure product teams would like to invite you to the following Customer Feedback Roundtable sessions at Microsoft Build 2025. Sign up here to join our roundtable sessions: https://aka.ms/AzGovRoundtable. This is a unique opportunity for you to share your insights and help shape the future of Azure. These roundtables will be filled on a first-come, first-served basis, so don't miss your chance to sign up now! If you are not attending Build in person, no problem; if you are interested, we would like to invite you to participate in future online feedback sessions.

New Releases @ Build 2025

The Azure Governance team is excited to share the following new features across our product portfolio. For each feature, you will find an accompanying announcement with scenario details, documentation, and blog posts to follow along!
Jump to section: (New!) Azure Service Groups | Azure Policy | Azure Machine Configuration | Azure Resource Graph (ARG) | Azure Resource Manager (ARM)

(New!) Azure Service Groups

Azure Service Groups - Public Preview
A Service Group (SG) is a new grouping structure in Azure that supports flexible grouping of cross-subscription resources and multiple hierarchies of groups. Service Groups provide a unified view and management capabilities, enabling:
Low-Privilege Management: Service Groups are designed to operate with minimal permissions, ensuring that users can manage resources without needing excessive access rights and appealing to multiple personas.
Flexible Cross-Subscription Grouping: Azure resources and scopes, from anywhere in the tenant, can become members of one or multiple Service Groups.
Varying Hierarchies: Service Groups can be self-nested, providing the ability to have multiple hierarchy structures of resource containers.
Data Aggregation & Views: Aggregate data from resources across subscriptions for practical workloads. View application health (via Health Model) and important data values centered around your chosen perspective.
You can reach our team by email at [email protected] with any questions or comments!
TechCommunity Blog: https://aka.ms/servicegroupspreview
MS Learn Documentation: http://aka.ms/servicegroups

Azure Policy

New Features currently in Private Preview
Many of the Azure Policy enhancements, including user-based exemptions, caller-type-based enforcement (e.g., user vs. service principal), and IP filtering, are currently in private preview and will soon be available to the public. Stay tuned!

Azure Machine Configuration

Linux SSH Posture Control Policy - Generally Available
We are excited to announce additional built-in capabilities for Linux management scenarios through Azure Policy and Machine Configuration. Through new built-in policies, you can manage your SSH configuration settings declaratively at scale. SSH Posture Control enables you to use the familiar workflows of Azure Policy and Machine Configuration to:
Ensure compliance with standards in your industry or organization
Reduce the attack surface of remote management features
Ensure consistent setup across your fleet for security and productivity
SSH Posture Control also provides detailed Reasons describing how compliance or non-compliance was determined. These Reasons help you document compliance for auditors with confidence and evidence, and they enable you to take action when non-compliance is observed.
MS Learn Documentation: What is SSH Posture Control? | Microsoft Learn

Windows Server 2025 Audit Policy (powered by OSConfig) - Generally Available
You can now deploy the Windows Server 2025 security baseline to your environment to ensure that desired security measures are in place, providing a comprehensive and standardized security framework. The Windows Server 2025 baseline includes over 300 security settings to meet industry-standard security requirements, and it provides co-management support for both on-premises and Azure Arc-connected devices. The OSConfig tool is a security configuration stack that uses a scenario-based approach to deliver and apply the desired security measures for your environment.
MS Learn documentation: Configure security baselines for Windows Server 2025 | Microsoft Learn

Onboarding Arc Machines at Scale to Machine Configuration in the Azure Portal - Public Preview
With the integration of Machine Configuration audit policies into the Arc at-scale onboarding experience, you can now quickly deploy audit policies to get a deeper look at the security posture of your Arc-enabled servers. Whether you're looking to test Machine Configuration on a single Arc machine or to deploy a policy across a broader scope of machines, your deployment workflow just got much easier with this integration.

Azure Resource Graph (ARG)

ARG GET/LIST API - Private Preview
Now in private preview, the Azure Resource Graph GET/LIST API is a highly scalable, fast, and performant alternative to existing control-plane GET and LIST API calls within the Azure ecosystem. This API helps you mitigate throttling-related issues, such as performance degradation and failed requests, by offering a 10x higher read throttling quota to callers, ensuring faster and more efficient read operations for your critical cloud-native workloads. Contact [email protected] to join the private preview program!

Azure Resource Graph Copilot – Generally Available
With the release of the Azure Resource Graph (ARG) skill within Copilot, customers can access the ARG query skill through the Azure portal or GitHub Copilot.
Questions about resource governance like "how many Linux VMs do I own" will be sent to the ARG skill. With this release, customers can easily turn natural-language questions into ARG queries. ARG Copilot helps users create queries to quickly surface insights about resources and simplify operational investigations. (An explicit example of the kind of query this produces appears at the end of this post.)
MS Learn documentation: https://learn.microsoft.com/azure/copilot/get-information-resource-graph

Azure Resource Manager (ARM)

EU Data Boundary enabled by ARM - Generally Available
Going beyond Azure's existing data-storage commitments, you can now store and process EU data in the EU by leveraging the EU Data Boundary enabled by Azure Resource Manager. With Azure Resource Manager, you can ensure that in-scope global Azure metadata, including EUII, EUPI, Customer Content, and Support Data, is routed, processed, and stored entirely within EU Data Boundary countries and datacenter locations. This builds on Azure's existing regional metadata privacy commitments and helps our European customers achieve greater control over data locality to meet regulatory, compliance, and sovereignty requirements.
MS Learn Documentation: What is the EU Data Boundary? - Microsoft Privacy | Microsoft Learn

Stay Updated

Keep in touch with Azure Governance products, announcements, and key scenarios. Bookmark the Azure Governance Tech Community Blog, then follow us @AzureGovernance on X (previously known as Twitter). Share product feedback and ideas with us here: Azure Governance · Community
For questions, you can reach us at:
Azure Policy: [email protected]
Azure Resource Graph: [email protected]
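Returning to the "how many Linux VMs do I own" example above, here is roughly what such a question looks like as an explicit Resource Graph query, run here through the Az.ResourceGraph PowerShell module. Filtering on the OS disk type is one common way to identify Linux VMs; it is not necessarily the exact query Copilot would generate:

```powershell
# Requires the Az.ResourceGraph module (Install-Module Az.ResourceGraph) and an authenticated Az session.
$query = @"
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where properties.storageProfile.osDisk.osType =~ 'Linux'
| summarize linuxVmCount = count()
"@

Search-AzGraph -Query $query
```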
Azure Backup for PostgreSQL flexible server: Enterprise-grade solution for resiliency and compliance

At Microsoft Build 2025, Microsoft Azure announced the General Availability (GA) of vaulted backups by Azure Backup for Azure Database for PostgreSQL – flexible server. This solution helps customers meet compliance and cyber-resiliency needs.

Key features:
Policy-based scheduled backups that eliminate the need for manual intervention
Cyber-resiliency features such as soft delete of backups, immutability of the Backup vault, and role-based access to backups
Security features such as encryption of data at rest and in transit, and support for customer-managed keys for encrypting and storing backups
Long-term retention of backups (up to 10 years)
Redundant storage options for backup data with zonal and regional replication
Recoverability to the paired secondary region via cross-region restore
Integration with Azure Business Continuity Center, which provides a single pane of glass for managing backups

How it benefits customers
Enhanced Security: Ensures that backups are protected from unauthorized access and potential threats by encrypting backup data and storing it in a vault.
Cyber Resiliency: Enhances your organization's cyber resiliency by ensuring that your data is safeguarded against cyber threats such as ransomware attacks, which could otherwise lead to data loss for the business. In the event of an attack, you can quickly restore your data from secure backups, minimizing the impact on your operations.
Compliance: Meet data compliance requirements by ensuring that backups are stored securely and retained according to regulatory standards. With regulatory bodies updating their data management mandates, this becomes even more crucial. Azure Backup supports retention of backed-up data for up to 10 years.
At-scale Management: Enterprise-grade features and management via Azure Business Continuity Center, which offers a single-pane-of-glass experience to manage, operate, and govern all protected resources.

How it works
Azure Backup takes a full logical backup (using the native pg_dump command) of the PostgreSQL flexible server. This approach to vaulted backups relies on the native open-source format, a design decision made for the following reasons:
Version-agnostic restores: Greater flexibility in restoring backups across different database versions.
Open-source format: Allows restoration of backups to the platform of your choice (Azure PostgreSQL flexible server, virtual machines, on-premises, other cloud providers, etc.).
Backup policies manage schedules and retention, supporting weekly backups. Retention can be set with weekly, monthly, or yearly rules and kept for up to 10 years, with yearly rules taking priority; if no other rules are set, the default retention rule applies. After the backup configuration is complete, a backup instance is created in the Backup vault. Use it to initiate restores, monitor activity, stop protection, and perform other backup operations. Azure Backup automatically runs the scheduled backup jobs; these jobs run independently, preventing disruptions during long-running tasks. Full backups are taken, remain in the vault per policy, and are deleted once the retention period ends. Azure Backup allows restoring data from any recovery point within the retention period set by the backup policy. Recovery points are created while the PostgreSQL flexible server is in a protected state and can be used for restores until they expire per the retention policy. Backups are restored as .sql files using the native pg_restore command, which allows greater flexibility in restoring backups across different database versions.
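For intuition, here is roughly what a manual equivalent of that dump-and-restore flow looks like with the native client tools. Server names, database names, and file names are placeholders, and the exact dump format and flags the Azure Backup service uses may differ:

```powershell
# Placeholder names throughout; the service runs pg_dump/pg_restore on your behalf.
# Dump one database in PostgreSQL's custom archive format ...
pg_dump --host=source.postgres.database.azure.com --username=myadmin --format=custom --file=appdb.dump appdb

# ... then restore that archive into another server or database with pg_restore.
pg_restore --host=target.postgres.database.azure.com --username=myadmin --dbname=appdb_restored --no-owner appdb.dump
```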
Pricing Information

Azure Backup charges for protecting Azure Database for PostgreSQL flexible server are similar to those for other workloads: the customer is charged a Protected Instance (PI) fee and a backup storage fee. For more details, please refer to the Azure Backup for PostgreSQL flexible server pricing page. To get an estimate of the costs, you can use the Pricing Calculator for Azure Backup for PostgreSQL flexible server. This tool allows you to input your specific requirements and provides a detailed breakdown of the associated costs.

Getting started

Configuring vaulted backups for PostgreSQL flexible server is a straightforward process, and you can use your preferred method. Visit the Business Continuity Center in the Azure portal to configure backups; you can follow the step-by-step guide here. Other ways to configure backups are via Azure CLI or PowerShell. You can also use a Terraform template, ARM template, or Bicep template.

VPN on Windows Server 2016 not working
I followed the standard procedure to set up VPN on Windows Server 2016. Let me jump to where I am now. The Event Viewer has the following two entries when a client connects to the VPN server:
A connection between the VPN server and the VPN client 72.74.70.135 has been established, but the VPN connection cannot be completed. The most common cause for this is that a firewall or router between the VPN server and the VPN client is not configured to allow Generic Routing Encapsulation (GRE) packets (protocol 47).
CoId={23FC7BC4-0885-5E63-715B-8EFAD37B9E15}: The following error occurred in the Point to Point Protocol module on port: VPN2-127, UserName: <Unauthenticated User>. Negotiation timed out
I am not familiar with GRE, so I added rules for both inbound and outbound GRE on both the Windows Server 2016 machine and the client machine (Windows 11 Pro). Could anyone offer a direction to guide me in diagnosing this?
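For reference, the rules I added look roughly like this (the display names are just what I picked):

```powershell
# GRE is IP protocol number 47, so the rules allow protocol 47 rather than a TCP/UDP port.
New-NetFirewallRule -DisplayName "Allow GRE In"  -Direction Inbound  -Protocol 47 -Action Allow
New-NetFirewallRule -DisplayName "Allow GRE Out" -Direction Outbound -Protocol 47 -Action Allow
```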
Sustainable cloud journey: from on-premises to Azure optimization

Introduction: a new era of sustainable cloud transformation

Sustainable innovations help organizations stay competitive, manage costs, and build resilience amid rapid shifts in markets and climate risk, all key drivers in moving to the cloud. Microsoft considers sustainability a core value, as demonstrated by our identification and implementation of opportunities to reduce energy, water, and waste consumption in datacenters; however, our impact doesn't stop at our cloud operations. Microsoft Azure provides dedicated tools to help plan, manage, and continually optimize the carbon footprint of cloud workloads, from migration planning (comparing on-premises vs. cloud environmental impact) to ongoing management and optimization of Azure resources for minimal emissions. Today, we are excited to announce two new sustainability-focused capabilities that empower you to make this journey end to end:
Sustainability in Azure Migrate business case – now in public preview: Estimate and compare on-premises vs. Azure emissions when building your cloud migration strategy.
Carbon optimization in Azure – now generally available: Continuously monitor the carbon emissions of your Azure workloads in the Azure portal, with granular data and actionable recommendations that help you reduce both emissions and costs.
Let's explore how they work across each phase of your cloud journey.

Improve your sustainability profile through cloud migration

Organizations analyze a variety of factors when considering a move to the cloud, and potential improvements to their sustainability profile are one of them. Azure Migrate, Microsoft's migration and modernization platform, now incorporates sustainability considerations into migration planning. These new capabilities are designed for IT infrastructure teams, cloud architects, and project managers responsible for datacenter transformation. They also aid financial decision-makers and sustainability leads who need to understand the environmental return on investment. By speaking the language of both cost savings and carbon savings, Azure Migrate gives insights that help build consensus across stakeholders evaluating how a move to Azure could benefit both the business and the planet. Here are the key capabilities provided for sustainable migration planning:
Application-Aware Assessments: Azure Migrate can assess entire multi-server applications as a single unit. This means you can identify and group all dependent servers, databases, and resources that make up an application and evaluate them together.
Integrated Sustainability Metrics: The new sustainability assessment (in preview) in the Azure Migrate business case adds carbon emissions analysis to your migration assessments. When sizing up an on-premises environment versus Azure, you can now get estimates of the potential reduction in carbon footprint from migrating those workloads to Azure. (Note: this capability is being introduced in preview.)
Enhanced Guidance and User Experience: A refreshed guided user experience walks users through the migration process step by step. The workflow is organized into clear stages, from creating your migration project, to assessment (with a choice of first- or third-party assessment tools per workload type), to migration execution.

Optimize cloud emissions with carbon optimization in Azure

Once you've migrated and your workloads are running in the cloud, continuous optimization is important for maximizing efficiency and minimizing underutilized resources.
This is where carbon optimization in Azure comes in. Now generally available, carbon optimization in Azure is accessible directly in the Azure portal (simply search for "carbon optimization"). It surfaces emissions insights and recommendations right where you manage your cloud resources, making it easy for developers, IT professionals, and FinOps teams to incorporate sustainability into daily cloud operations. Here are the key features that help you manage and reduce emissions:
Granular emissions visibility: Carbon optimization in Azure offers detailed data on carbon emissions from your Azure usage, segmented by subscription, resource group, individual resource, type, and location. This transparency is essential for businesses incorporating IT emissions into their decisions.
Built-in portal experience with RBAC: The carbon optimization dashboard is integrated into the Azure portal, requiring no separate setup. It supports role-based access control (RBAC), ensuring that only authorized roles (Owner, Contributor, Reader, or Carbon Optimization Reader) can access emissions data for a subscription, safeguarding sensitive sustainability information.
Data export and APIs: Carbon optimization allows exporting emissions metrics (to CSV) and accessing them via REST APIs for integration into reporting tools, dashboards, or sustainability systems. This enables holistic analysis by combining carbon and cost data.
Optimization recommendations: Perhaps the most impactful feature, carbon optimization in Azure works in tandem with Azure Advisor to analyze resource utilization and provide AI-driven recommendations to reduce carbon emissions. The primary focus of these recommendations is eliminating underutilization.
Companies can regularly review their dashboard to track progress towards sustainability targets, just as they use other Azure dashboards to track cost, uptime, or security posture. This helps turn sustainability into a day-to-day practice rather than a yearly audit.

From efficient migration to continuous optimization

Consider a sample scenario: an enterprise plans to migrate on-premises apps to Azure. They find that using Azure can cut carbon emissions by, say, 50% thanks to more efficient infrastructure, adding to the migration proposal's appeal. The team migrates the apps, grouping components for efficiency. In the cloud, they activate carbon optimization in Azure. The carbon optimization dashboard shows that some VMs are underutilized at night, prompting the team to right-size them and schedule non-critical VMs to shut down during off-peak hours. This further reduces emissions and costs with minimal effort, tracked in the dashboard for visibility. (A small automation sketch of the off-peak shutdown idea appears at the end of this post.)

Together, sustainability in Azure Migrate business case and carbon optimization in Azure create an end-to-end loop for sustainable cloud computing, ensuring that sustainability isn't a one-time consideration but a core part of your cloud journey. As you plan your next steps in the cloud:
Evaluate your own cloud migration plans with an eye on sustainability. Work with your licensing team to use the new sustainability in Azure Migrate business case to see how moving to Azure could lower your environmental impact.
Explore carbon optimization in Azure in the Azure portal for your existing Azure resources. Even if you've been running in Azure for years, you might be surprised by the insights this tool provides. Get started in the Azure portal today.
Together, let's make every cloud investment an investment in a sustainable future.
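As a footnote to the off-peak shutdown idea in the scenario above, a sketch along the following lines could run from Azure Automation or a scheduled task. The "Shutdown = OffPeak" tag is an example convention, not a built-in Azure feature, and the selection logic is illustrative only:

```powershell
# Illustrative sketch: stop running VMs that carry an (example) off-peak shutdown tag.
# Assumes an authenticated Az session (Connect-AzAccount, or a managed identity in Azure Automation).
Get-AzVM -Status |
    Where-Object { $_.Tags['Shutdown'] -eq 'OffPeak' -and $_.PowerState -eq 'VM running' } |
    ForEach-Object {
        Stop-AzVM -ResourceGroupName $_.ResourceGroupName -Name $_.Name -Force -NoWait
    }
```

A matching start-up job in the morning, plus Azure Advisor's right-sizing recommendations, would close the loop described above.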
Why Even Stateless AKS Clusters Might Need Backup

When we talk about backup solutions for Kubernetes clusters, the need is usually most obvious in the context of stateful workloads: those managing persistent data like databases, file systems, or applications with long-lived storage. For such workloads, backups are non-negotiable, acting as the safety net for data recovery, disaster protection, and business continuity. But what about stateless AKS clusters? At first glance, stateless workloads seem like perfect candidates to not back up. After all, by design:
They don't rely on persistent volumes or databases
Their deployments can be recreated from container images, Helm charts, or Git repositories
The infrastructure is defined declaratively using infrastructure-as-code (IaC) or GitOps pipelines
CI/CD systems can tear down and recreate entire environments on demand
In such idealized environments, backup appears redundant. Any failure or disaster can be handled by simply re-running the pipeline or restoring manifests from Git. If everything is version-controlled, reproducible, and ephemeral, why invest in backup infrastructure and introduce the complexity and cost of a backup solution? This line of thinking is valid in theory. In fully mature DevOps setups with no manual drift, strict Git hygiene, no compliance overhead, and zero state stored in-cluster, the cost and complexity of backups might outweigh their limited benefits. The answer lies in the real-world operational gaps, compliance requirements, and reliability expectations that engineering teams encounter even in seemingly "stateless" environments. In this article, we'll explore the key scenarios where backing up stateless AKS clusters adds value, even in cloud-native, DevOps-driven organizations.

When Stateless Doesn't Mean Disposable

In organizations with well-established DevOps practices, GitOps and infrastructure-as-code pipelines manage everything declaratively. The entire cluster state, from workloads to configurations, is reproducible from Git. In such environments, backup may seem redundant. However, not every team is operating at peak DevOps maturity, and even in mature teams, gaps and exceptions exist. Let's look at situations where backup for stateless clusters becomes essential. It's important to note that most Kubernetes backup solutions, including Azure Backup for AKS, capture the desired state by backing up YAML manifests, Custom Resources, and selected cluster metadata, while typically opting out of backing up the full etcd store, which is Kubernetes' actual source of truth. As a result, a restore usually reflects what was intended to run, not necessarily what was actually running at the moment of failure. This distinction underscores the importance of regular, automated backups to bridge the gap between configuration intent and operational reality.

1. Absence or Inconsistency in GitOps Practices

While GitOps aims to be the single source of truth, reality often diverges from this ideal:
Manual changes are introduced in the heat of production incidents.
Drift occurs between the declared and actual cluster state.
Some components are not version-controlled (e.g., in-cluster secrets, ad-hoc cron jobs, Helm releases with local values).
Consider this example: a retail engineering team hot-fixed a frontend deployment by patching environment variables directly in the cluster during a Black Friday outage. They never committed the changes to Git, and after the dust settled, the service stopped working during redeployment.
The backup captured the exact deployment spec from that moment, environment variables included, allowing the team to restore the working state without guesswork. In such cases, a backup acts as a point-in-time capture of the actual cluster state, allowing teams to recover or audit what truly ran in the environment, not just what was intended per the Git repositories.

2. Compliance and Regulatory Requirements

Industries with strict governance, such as finance, healthcare, and government, often require:
Auditability of production configurations
Retention of infrastructure state for post-mortem or regulatory reviews
Proof of controls around cluster state
Even for stateless workloads, backups can satisfy these compliance demands, especially when you need to demonstrate that critical workloads and configurations can be recovered exactly as they were. Consider this example: a fintech company undergoing a routine audit had to produce a record of its production deployments from the previous quarter. Their GitOps pipeline had undergone changes, and some manifests were overwritten. Fortunately, they were able to generate historical records from AKS backup snapshots, satisfying audit requirements and avoiding potential penalties.

3. Forensic Analysis and Post-Incident Review

When investigating a production outage or security incident, engineers often need to understand:
What was deployed?
What configmaps or secrets were present?
Were any unusual workloads running?
Consider this example: after a failed release, a security team launched an investigation and discovered that a crypto-mining container had been injected as a sidecar. It wasn't in source control, and log retention had expired. A namespace-level backup taken a few hours before the incident provided a complete snapshot, enabling them to analyze the rogue container's configuration and timeline. Backups offer a forensic lens into the cluster state at a specific point in time. This is invaluable for root cause analysis and may reveal insights that are no longer visible due to pod churn or log retention limits.

4. Accelerated Recovery and Operational Simplicity

Even stateless applications can take time to recover in the event of:
Cluster-level failures
Region outages
Misconfigured redeployments
In contrast to rehydrating from scratch, restoring a cluster or a namespace from a backup snapshot can significantly reduce RTO (Recovery Time Objective), especially for complex environments with multiple microservices, RBAC settings, custom resources, and interdependencies. Consider this example: an e-commerce platform hosted in a single AKS cluster experienced a regional outage. Though all apps were stateless and versioned in Git, restoring the entire namespace from a recent backup, complete with network policies, secrets, and service bindings, allowed them to relaunch in a new region within minutes, far faster than a redeployment pipeline alone would have allowed.

5. Preserving the Actual Source of Truth

In many organizations, Git is not always the full source of truth:
Onboarding of legacy applications with partial IaC coverage
Teams using Helm or Kustomize without consistent repo structures
Clusters with long-lived manual tweaks
In these cases, the cluster itself becomes the de facto source of truth. A backup ensures you can retain and reproduce what worked, even if the corresponding manifests don't exist (or are outdated). Consider this example: a SaaS provider relied on Helm charts with custom values set locally by different teams.
During a platform upgrade, they lost track of multiple override configurations that had not been committed to Git. The backup captured the running state, including those local Helm values, allowing for an exact rollback and informing future IaC improvements. (For an ad-hoc illustration of capturing live state, see the sketch after the conclusion.)

Conclusion

Stateless does not mean disposable, and it certainly does not mean irrelevant to backup planning. While there are cases where backup for stateless AKS clusters can be safely skipped, many real-world environments face drift, compliance demands, or operational complexity that make backups not just helpful, but necessary. If your organization isn't yet at peak GitOps hygiene, or operates in a regulated industry, then backing up even your stateless workloads is a wise investment: not because they hold data, but because they hold context, and context is critical for resilience, auditability, and operational excellence. To get started with protecting both stateless and stateful workloads, explore Azure Backup for AKS and see how it fits into your Kubernetes resilience strategy.
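As an aside on the "capture what actually ran" theme above, even outside a backup product you can take an ad-hoc, point-in-time export of a namespace's live objects. The namespace name and resource types listed are placeholders; this is an illustration, not a substitute for Azure Backup for AKS:

```powershell
# Dump the live definitions of common object types in the "prod" namespace to a single YAML file.
# Secrets come out base64-encoded, so treat the resulting file as sensitive.
kubectl get deployments,services,configmaps,secrets,ingresses -n prod -o yaml > prod-snapshot.yaml
```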
Tech Community Live: Windows edition

Deploying and managing Windows? Now available on demand: four hours of Ask Microsoft Anything (AMA) sessions with our product and engineering teams! We're sharing the latest and greatest Windows capabilities at Microsoft Ignite, so join us here on the Tech Community. Wherever you are in your journey to cloud native, whether you have a full Windows 11 ecosystem or aren't quite there yet, we will be here to help and support you with guidance, tips, and insights.
Tech Community Live: Windows edition - now on demand!
AMA: Windows security, chip to cloud
AMA: Unified update management for Windows
AMA: Hotpatching Windows - client and server
Feedback wanted: Adopting new features with agility and control
AMA: The future of AI with Windows 11 and Copilot+ PCs