ORACLE LINUX CONTAINER UPDATE by Gilbert Standen, Principal Solutions Architect, Robin Systems.
Applications are the lifeblood of modern business. This puts an undue burden on application developers and IT administrators to manage the application lifecycle, deliver quality of service (QoS), and ensure user satisfaction. For example, a distributed application is typically deployed as individual components that must be manually provisioned and individually managed. Additionally, applications are deployed on dedicated servers or virtual machines just to ensure application QoS. Such forced alignment with artificial server, OS, and storage boundaries not only causes deployment delays and complexity, but also results in underutilized hardware and inflated operational costs. Limited application awareness at the infrastructure level makes it very difficult to deliver on SLAs. Also, tight coupling between the application and the underlying operating platform (OS or hypervisor) compromises application portability and developer productivity. Adoption of VMs for high-performance workloads has not been feasible in all use cases because of the performance penalties that hypervisor-based elasticity and density solutions impose.
Oracle Database 12c in LXC Linux containers is now fully supported on Oracle Linux 6 and Oracle Linux 7 running UEK3 or higher (the very recently released UEK4 is also supported for LXC containers). With this relatively recent announcement, the door has been fully opened for the Linux container application-defined data center era to embrace Oracle databases as well as applications.
UNVEILING THE APPLICATION-DEFINED DATA CENTER ERA
Software and hardware vendors are introducing products for a new era in data center evolution: the Linux-containerized, application-defined data center. Container-based, application-aware compute and storage platforms abstract away the underlying operating platform from applications by making the server, virtual machine, and storage boundaries invisible. A containerization platform transforms commodity hardware into a compute, storage, and data continuum that enables:
- Applications sharing a server or machine without any loss of performance or predictability. Thanks to container-aware and application-centric storage technology, each application gets a guaranteed QoS level all the way from application to storage spindle. This enables even the most critical enterprise applications, such as databases and big data clusters, to be consolidated in Linux containers with no performance compromise.
- Transparent application mobility across machines without data loss. By decoupling compute from storage, applications are protected from server failures, and Linux containers can be migrated without moving or copying data, ensuring seamless data access for applications no matter where the Linux containers are hosted.
- Fast and simple application deployment and lifecycle management. Leveraging container agility (they provision in minutes and boot from init because there is no virtual hardware) ensures that even the most complicated distributed applications such as Hadoop or NoSQL databases such as Cassandra and MongoDB, as well as Oracle DB and applications, can be deployed within a matter of minutes. Quick application clones can be created within seconds and application-level snapshots allow applications to go back and forth in time for test and dev purposes.
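As a rough illustration of the provisioning workflow behind the last point, the sketch below assembles the LXC command lines that would create and start an Oracle Linux container. It assumes the LXC userspace tools and their "oracle" template are installed on the host; the container name ol7ctr1 and the release string are hypothetical, and the commands are only printed here, not executed.

```python
import shlex


def lxc_create_cmd(name, release="7.latest", template="oracle"):
    """Build the argv for creating a container from a distribution template.

    Assumption: the host has the LXC tools and the named template installed.
    Everything after "--" is handed to the template script, not to lxc-create.
    """
    return ["lxc-create", "-n", name, "-t", template, "--", f"--release={release}"]


def lxc_start_cmd(name):
    """Build the argv for starting a container in the background (-d)."""
    return ["lxc-start", "-n", name, "-d"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe anywhere.
    for argv in (lxc_create_cmd("ol7ctr1"), lxc_start_cmd("ol7ctr1")):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argument lists separately from executing them keeps the example scriptable: the same lists could be passed to subprocess.run on a host where the tools are present.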
BARE METAL DEPLOYMENTS
Concerns about data privacy or cost constraints, combined with massive data volumes, lead many organizations to favor on-premises bare-metal deployments for their data-driven distributed applications. The state of the art for big data applications, established by the flagship enterprises where these technologies were born, was to use bare metal and eschew virtualization. As these technologies are democratized to the broader enterprise segment, bare metal remains in many cases the de facto standard for applications requiring QoS guarantees. This leads to the following pain points:
- Cluster sprawl: To ensure performance isolation, each application is deployed on a dedicated physical cluster, which is typically over-provisioned to accommodate bursts of activity. For the same application, separate physical clusters are provisioned for development, test, QA and production environments. This leads to ballooning infrastructure and per-processor licensing costs, inefficient resource utilization, and increased management complexity.
- Unregulated data copies: Each physical cluster hosts its own copy of data. Apart from the obvious cost implications, data copy management raises its own unique set of challenges, including security, data governance, and questions around the “single source of truth”. Development and test environments are often compromised internally, with middle management granting bootleg access in pressure situations. These environments also typically operate only on a subset of the production data, which leads to undesirable, time-consuming activities such as data masking and data stripping, and to conflicts over the accuracy of such steps. There have even been cases of bare metal servers being repurposed outside the firewall without a prior full data scrub, with data theft occurring as a result.
- Poor operational agility: The bare metal approach couples application deployment cycles with hardware procurement cycles. When a new application needs to be deployed, the process includes procurement and installation of hardware, followed by application installation, configuration, and tuning. This can add several months of delay to the application release cycle, or even lead to cancellation of an entire rollout due to rigid enforcement of CAPEX constraints, hobbling the progress of the enterprise.
- Lack of scalability: When an application’s demand exceeds the capacity of the existing hardware infrastructure, additional hardware must be procured, installed, configured, and added to the existing physical infrastructure. This leads to long scaling delays, band-aid fixes, and budget turmoil, requiring IT managers to “go back to the well” mid-year and exposing them to planning criticism from the C-suite.
VIRTUAL MACHINE DEPLOYMENTS – THE OLD ALTERNATIVE
Virtualization has delivered tremendous value to the enterprise over the past decade. Many organizations have consolidated their IT operations into private, public, or hybrid clouds and rely on virtualization to serve the infrastructure needs of modern distributed applications. However, there are fundamental reasons why this approach has not found favor in state-of-the-art deployments of distributed data-driven applications in high-performance environments such as financial/banking and scientific computing.
- Jitter: Virtual environments use a best-effort approach to meet the resource requirements of an application, so a high degree of variability can be expected in application performance. A well-known blight for massively distributed applications is the “long-tail” phenomenon, where a small variation in the performance of one of the “n” identical tasks in a distributed application can have a large impact on response time. Virtual environments dramatically exacerbate the long-tail problem. For example, if someone runs a Cartesian product query, processing can grind to a halt, requiring development and operations staff to drop what they are doing and triage.
- Performance penalties: Highly distributed applications that deal with large volumes of data are very sensitive to storage and network bandwidth and latency. These are precisely the pathways hit hardest by hypervisor-caused performance degradation. It is not unusual to experience a 20% impact on storage and network latencies when using a VM hypervisor. This can be a “deal-breaker” for deployment, since these applications require steady high performance and reliability.
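The long-tail effect described under "Jitter" can be made concrete with a little probability. If each of n parallel tasks independently has probability p of being slow, the whole request is delayed whenever at least one task is slow, which happens with probability 1 - (1 - p)^n. The sketch below is a minimal illustration; the independence assumption and the example values of p and n are illustrative choices, not measurements from any deployment.

```python
def prob_request_delayed(p_slow, n_tasks):
    """Probability that at least one of n independent tasks is slow.

    Assumption: tasks are independent and each is slow with probability p_slow.
    """
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be between 0 and 1")
    if n_tasks < 0:
        raise ValueError("n_tasks must be non-negative")
    return 1.0 - (1.0 - p_slow) ** n_tasks


if __name__ == "__main__":
    # Hypothetical 1% per-task chance of a slow run, at increasing fan-out.
    for n in (1, 10, 100, 1000):
        print(f"n={n:5d}  P(delayed)={prob_request_delayed(0.01, n):.3f}")
```

Under these assumptions, a 1% chance of a slow task already delays roughly 63% of requests at a fan-out of 100, which is why even modest per-node jitter matters so much for massively distributed applications.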
AN APPLICATION-CENTRIC APPROACH
As innovation in the application space continues to accelerate, it is imperative to adopt a technology that understands applications. A platform is needed that not only liberates applications from the complex time-consuming deployment processes, but also ensures isolation of system resources among all running applications via built-in monitoring. With cloud adoption, the data center has become amorphous, scaling across private and public data centers. Applications now need to be portable and be able to move across data center boundaries with the click of a button.
LINUX CONTAINERIZATION PLATFORM – THE NEW ALTERNATIVE
Linux containerization provides an out-of-the-box, fully supported solution for hosting all the applications in an enterprise on a shared platform created out of commodity, special-purpose, or cloud components. It can be deployed on bare metal or on virtual machines, allowing organizations to rapidly deploy multiple instances of their data-driven applications on premises or in the cloud without creating additional data copies. Hardware vendors are also currently rolling out Linux-container-aware storage features. A containerization platform consists of three components as described below.
CONTAINERIZED AGILE COMPUTE
Containerization is an efficient and lightweight operating system (OS)-based virtualization technology that allows creation of a compartmentalized and isolated application environment on top of the standard OS. While a hypervisor abstracts the virtualized OS from the underlying hardware, containers simply partition the OS and everything underneath, leading to simplified application deployment, seamless portability across hardware and operating platforms, and massive manageability improvements.
Linux containerization can benefit all types of enterprise applications, including highly performance-sensitive workloads such as databases and big data. Deployment technologies can pick the appropriate container configuration depending on the type of application. Traditional applications are deployed within “system containers” to provide VM-like semantics. Stateless microservices applications are already widely deployed in Docker containers.
Linux containerization on bare metal leads to the “zero-performance-impact” application consolidation of databases, Hadoop clusters, and other distributed applications such as Elasticsearch, resulting in significant operational efficiency gains and cost reduction, as well as massive elasticity and density improvements.
CONTAINER-AWARE SCALE-OUT STORAGE
Traditional storage technology is not designed to handle the rigors of the containerized environment. Container-aware storage from the software-defined sector is designed from the ground up to support agile sub-second volume creation, 100,000-plus variable-sized volumes, and the varied file system and data protection needs of applications running within the containers. Hardware vendors are also starting to offer nascent container-aware storage options, although these have generally not been built from the ground up for containers and are rather add-ons to embrace the growing number of customers running Linux containers. Nevertheless, combining these software-defined offerings with the new container-aware hardware offerings from storage vendors is sparking a revolution in application data center design, efficiency, and performance standards.
Software-defined container storage solutions offer intelligent data protection that “understands” the data protection needs of application components to minimize overhead and maximize performance. For example, stateful applications such as databases are assigned “protected volumes” using replication or erasure-coding techniques, whereas the stateless components, such as webservers, are allocated standard volumes. By providing enterprise-grade data protection at the storage layer, such approaches obviate the need for inefficient application-level data protection schemes, such as the 3-way replication approaches used by Hadoop and other distributed applications. This results in storage savings of 50% or more and helps improve the write performance.
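The size of the saving depends on which protection scheme replaces 3-way replication. The sketch below works through the arithmetic as raw bytes stored per byte of user data. The 10 data + 4 parity erasure-coding layout is a hypothetical example chosen for illustration, not a configuration named by any particular product.

```python
def replication_overhead(copies):
    """Raw bytes stored per byte of user data when keeping N full copies."""
    return float(copies)


def erasure_coding_overhead(data_fragments, parity_fragments):
    """Raw bytes stored per user byte for a (k data + m parity) erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def savings_vs(baseline, alternative):
    """Fraction of raw capacity saved by moving from baseline to alternative."""
    return 1.0 - alternative / baseline


if __name__ == "__main__":
    three_way = replication_overhead(3)           # 3.0x raw capacity
    two_way = replication_overhead(2)             # 2.0x raw capacity
    ec_10_4 = erasure_coding_overhead(10, 4)      # 1.4x, hypothetical layout
    print(f"3-way -> 2-way replication: {savings_vs(three_way, two_way):.0%} saved")
    print(f"3-way -> 10+4 erasure code: {savings_vs(three_way, ec_10_4):.0%} saved")
```

In this model, savings above 50% require an overhead below 1.5x, which a suitably wide erasure code provides while simple 2-way replication does not.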
Application-driven data lifecycle and application-centric data management using Linux containers greatly simplify application lifecycle management. Developers create quick application clones and snapshots for agile development and testing. Linux containers used with thin provisioning, compression, and copy-on-write technologies ensure that clones and snapshots are created in seconds using relatively small amounts of additional storage. Even a 100TB database can be cloned in a snap using only a few GBs of additional storage.
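Why a clone costs so little extra storage can be seen in a toy model of copy-on-write cloning. In the sketch below, a clone copies only the block map, and a whole-block write allocates one new physical block and repoints only the writer's map. The class and method names are hypothetical, and the model ignores compression, partial-block writes, and space reclamation.

```python
import itertools


class BlockStore:
    """Shared pool of immutable physical blocks, addressed by block id."""

    def __init__(self):
        self._data = {}
        self._ids = itertools.count()

    def put(self, data):
        block_id = next(self._ids)
        self._data[block_id] = bytes(data)
        return block_id

    def get(self, block_id):
        return self._data[block_id]

    def used_bytes(self):
        return sum(len(block) for block in self._data.values())


class Volume:
    """Logical volume: a map from logical block number to physical block id."""

    def __init__(self, store, block_map=None):
        self.store = store
        self.block_map = dict(block_map or {})

    def write(self, block_no, data):
        # Allocate a new physical block; other volumes keep the old one.
        self.block_map[block_no] = self.store.put(data)

    def read(self, block_no):
        return self.store.get(self.block_map[block_no])

    def clone(self):
        # Copies only the map of references, never the block contents.
        return Volume(self.store, self.block_map)


if __name__ == "__main__":
    store = BlockStore()
    base = Volume(store)
    for n in range(1000):
        base.write(n, b"x" * 4096)
    before = store.used_bytes()
    dev_copy = base.clone()
    dev_copy.write(0, b"y" * 4096)
    print("extra bytes used by clone:", store.used_bytes() - before)
```

The extra space consumed grows with the number of blocks the clone changes, not with the size of the source volume, which is the property that makes very large databases quick to clone.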
USE CASES FOR LINUX CONTAINERS
IT infrastructure for enterprise applications such as databases and mission-critical big data clusters suffers from high costs and poor agility and scalability. Virtualization has had limited penetration in this segment because of its performance overhead and inability to guarantee SLAs. Consequently, these applications are mostly deployed on dedicated physical clusters, leading to cluster sprawl, low hardware utilization, over-provisioning for peak demand, long lead times to scale or deploy clusters, and the need for petabytes of copied data for each cluster. These issues also apply to cloud deployments, where dedicated resources and high-performance storage translate to higher cost. Some use cases are shown below.
Consolidation of data applications on bare metal is feasible with Linux containers because there is no overhead from running a hypervisor, emulated hardware, or virtualized guests. This means more applications packed per server, for reductions in both CAPEX and OPEX. Disparate applications, ranging from traditional RDBMS to modern distributed applications such as NoSQL, Hadoop, and Elasticsearch, can all share the same physical node while running completely isolated within containers.
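To give a feel for what packing more applications per server means, the sketch below places containerized workloads onto servers with a first-fit decreasing heuristic. The workload names, core counts, and the 24-core server size are hypothetical, and the heuristic considers CPU cores only, ignoring the memory, storage, and QoS constraints a real placement engine would have to respect.

```python
def first_fit_decreasing(demands, capacity):
    """Place workloads on servers, largest first, using the first server with room.

    demands:  dict of workload name -> cores required (hypothetical figures)
    capacity: cores available on each identical server
    Returns a list of servers, each a list of the workload names placed on it.
    """
    servers = []  # each entry: {"free": remaining cores, "apps": [names]}
    ordered = sorted(demands.items(), key=lambda item: item[1], reverse=True)
    for name, need in ordered:
        if need > capacity:
            raise ValueError(f"{name} needs more cores than one server offers")
        for server in servers:
            if server["free"] >= need:
                server["free"] -= need
                server["apps"].append(name)
                break
        else:
            # No existing server had room, so bring another one into use.
            servers.append({"free": capacity - need, "apps": [name]})
    return [server["apps"] for server in servers]


if __name__ == "__main__":
    workloads = {"oracle-db": 16, "hadoop": 12, "cassandra": 8,
                 "mongodb": 6, "elasticsearch": 4, "web": 2}
    placement = first_fit_decreasing(workloads, capacity=24)
    print(f"{len(workloads)} dedicated servers -> {len(placement)} shared servers")
    for index, apps in enumerate(placement):
        print(index, apps)
```

With these made-up numbers, six workloads that would each occupy a dedicated server fit onto two shared ones.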
Self-service rapid DevOps provisioning eliminates the need to have a separate software stack for each individual team or for each phase of development, and helps to reduce the need for large separate storage pools that arise due to data duplication. Linux containerization enables efficient collaboration between software developers and IT professionals with the decoupling of compute and storage, shared data, and rapid provisioning of applications.
Complex big data applications can be provisioned with the push of a button in just minutes. Once provisioned, Linux containers can share data across clusters, which facilitates the elasticity to grow or shrink a cluster based on workload requirements.
Elastic, agile, high-performance foundations for data-driven applications. Linux containers deployed on bare metal, or on virtual machines, allow organizations to rapidly deploy multiple instances of their data-driven applications in just minutes, on premises or in the cloud, without creating additional data copies.
LINUX CONTAINERS AND DBaaS
Database-as-a-Service (DBaaS) is an operational approach that enables IT providers to deliver database functionality as a service to one or more consumers. With DBaaS, organizations can standardize and optimize on a platform. This eliminates the need to deploy, manage and support dedicated database hardware and software for multiple development, testing, production, and failover environments, for each application.
Benefits for DBaaS
- Consolidation and standardization of applications and the configuration on a single platform
- Consumer-driven self-service rapid provisioning and lifecycle management
- Predictable App-to-Spindle performance
Linux containerization provides a platform that is ideal for creating an enterprise-ready private DBaaS that conforms to these basic tenets, with very rapid, rpm-package-like deployment.
Recent Linux containerization software is now rolling out with management UIs for containers, where before there were generally only CLI options. UI management of Linux container deployments provides the single-pane-of-glass management that users of VM deployments have come to expect. Deploying Linux containers through feature-rich UI applications means that all the operational efficiencies of Linux containers are now available with a look and feel IT departments are comfortable with. GUI availability enables single-click management and deployment tools, and simplifies time-saving database lifecycle management tasks such as cloning and database snapshots, which were previously offered only through CLI tools. CLI options remain available so that organizations can automate and script deployments.
Modern data-driven distributed applications need a new kind of infrastructure. An infrastructure that makes machine boundaries functionally invisible, that “understands” the application topology along with the data and is capable of managing data lifecycle, and that enables successful multi-tenancy of clustered applications to analyze this data. Linux containerization is the “new stack” which has been designed to meet these requirements while preserving bare-metal performance.