The Red Hat Universal Base Image (UBI) has an end user license agreement which allows partners, customers and community members to deploy it anywhere, but it takes a lot more than a license to create a container base image that’s suitable for your enterprise applications. In part, suitability for enterprise deployments comes from the compatibility guarantees of a Linux operating system. No Linux container base image can claim compatibility or supportability everywhere. Compatibility must be engineered into a system like OpenShift, from Kubernetes down to the Linux kernel on the container host.
People often confuse portability with compatibility. Linux containers are generally considered “portable” because you can often run binaries built for one Linux distribution on another distribution of the same architecture. It’s often possible to run containers built from one distribution’s userland on another Linux distribution. This can be described as portability.
Portability is a design characteristic of operating systems and the filesystems that they use to store files. Engineers have to design this portability into the filesystems they work on; it’s not free. But portability is not the same thing as compatibility.
Just because you can copy a binary file between operating systems doesn’t mean that it will execute correctly. You can copy a binary (for example, one compiled from C/C++) from Fedora on ARM to Fedora on x86 using rsync, but it won’t run. That’s not controversial: we all know that ARM binaries can’t run on x86 machines, and that Windows binaries can’t run on Linux. The file is portable, but it’s not compatible.
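One way to see why the copied file cannot run is to look at its ELF header, which records the architecture the binary was compiled for. The sketch below is illustrative only: the function names `elf_machine` and `elf_machine_name` are made up for this example, it maps just a handful of `e_machine` values, and it assumes the standard `od` utility is available.

```shell
#!/bin/sh
# Print the target architecture recorded in an ELF binary's header.

elf_machine_name() {
    # Map a few well-known e_machine values to readable names.
    case "$1" in
        3)   echo "x86" ;;
        40)  echo "ARM" ;;
        62)  echo "x86-64" ;;
        183) echo "AArch64" ;;
        *)   echo "unknown ($1)" ;;
    esac
}

elf_machine() {
    file=$1
    # Bytes 0-3 must be the ELF magic: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not an ELF file" >&2; return 1; }
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian.
    data=$(od -An -tu1 -j5 -N1 "$file" | tr -d ' \n')
    # Bytes 18-19 hold e_machine; split them into $1 and $2.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    if [ "$data" = "2" ]; then
        code=$(( $1 * 256 + $2 ))
    else
        code=$(( $1 + $2 * 256 ))
    fi
    elf_machine_name "$code"
}
```

Running `elf_machine` against a binary on an x86-64 host should report x86-64, while a binary copied over from a 64-bit ARM system keeps reporting AArch64 no matter where it is stored. The copy succeeds; the execution is what fails.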
But what happens when you rsync a binary between two very similar operating systems on the exact same physical architecture? What happens when, instead of using rsync as the technology for copying a file, we use a container image? There’s no real difference between using rsync and using a container image; either way, you are copying binaries between Linux distributions. Rsync copies the file but does not offer any tooling to guarantee compatibility. Neither do container engines like Podman, CRI-O, Docker, or containerd. Container engines, like a Bash shell, just hand the binary off to the Linux kernel to execute.
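A quick way to convince yourself that the image only supplies the userland is to ask for the kernel release on the host and again inside a container. This is a minimal sketch, assuming Podman is installed; the function names `container_kernel` and `report_shared_kernel` are invented for this example.

```shell
#!/bin/sh
# Compare the kernel seen on the host with the kernel seen inside a container.

container_kernel() {
    # Run `uname -r` inside a throwaway container built from image $1.
    podman run --rm "$1" uname -r
}

report_shared_kernel() {
    image=$1
    host=$(uname -r)
    guest=$(container_kernel "$image") || return 1
    echo "host kernel:      $host"
    echo "container kernel: $guest"
    if [ "$host" = "$guest" ]; then
        echo "same kernel: the image supplies only the userland"
    else
        echo "different kernel: the engine is probably running inside a VM"
    fi
}
```

For instance, `report_shared_kernel quay.io/fatherlinux/fedora14` on a Linux container host would be expected to print the host's kernel release twice, even though the userland inside the image is years older.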
This is where some really smart people get led astray. They are often convinced that this ought to work (in the most philosophical of senses). It’s even more confusing because parts of the Linux operating system are designed to hide compatibility problems. The Linux kernel and glibc put a lot of effort into hiding these details from the end user, especially when upgrading versions. And, for the most part this works – until it doesn’t.
To better understand the work that Red Hat does to engineer compatibility between versions of UBI, CRI-O, Podman, and the Linux kernel in Red Hat Enterprise Linux (RHEL) underneath Red Hat OpenShift, we have to take a look at the technical roots of UBI, which go all the way back to the launches of RHEL 6 & 7.
Red Hat Enterprise Linux 6 & 7
Red Hat released its first container base image back in 2014 based on RHEL 6.5. Even with this very first release, Red Hat engineers discovered and fixed compatibility problems. To better understand, let’s review a timeline with RHEL 6 and containers. For reference, we will use the RHEL 6 release notes and the RHEL 6 Wikipedia page.
Timeline
- Inside Red Hat, the RHEL 6 branch is forked from Fedora 12 with backports from Fedora 13 and Fedora 14.
- May 2010, the first bug is filed for the RHEL 6 branch (592450).
- Nov 2010, just under 5,000 bugs have been fixed in the RHEL 6 branch (Bugzilla).
- Nov 2010, RHEL 6 is released to the public (release notes).
- Mar 2013, the first Docker container engine is launched upstream (Docker 0.1.1).
- May 2014, a bug is discovered preventing RHEL 6 container images from running on RHEL 7 (Bugzilla).
- May 2014, Red Hat engineering updates the container engine, thereby fixing Fedora 20, but this does not resolve the issue with RHEL 6 container images (docker 0.11 build).
- May 2014, Red Hat engineering backports libselinux container handling from RHEL 7 to RHEL 6, thereby resolving the issue (changelog – is_selinux_enabled).
- Jun 2014, RHEL 7 is released to the public with Docker 0.10 (release notes).
- Jun 2014, the first RHEL 7 base image is released (RHEL 7.0-21).
- Aug 2014, the first RHEL 6 base image is released (RHEL 6.5-11).
The order of operations here is critical. Red Hat released RHEL 7 as a container-native operating system with the Docker container engine and pre-built container images. Red Hat then released a supported RHEL 6 container image two months later. This eased the migration of applications from RHEL 6 to RHEL 7 by enabling them to run in containers. This was seminal compatibility work which enabled running older container images (RHEL 6) on a new container host (RHEL 7).
The timeline above shows some important things about compatibility bugs. They can be very difficult to troubleshoot and can have lots of misleading causes and resolutions. Such bugs:
- Can be a multi-part problem (container engine and container image).
- Can depend on the version of the container engine used on the container host.
- Can stem from problems with the underlying libraries in the container image.
- Can require backporting code (only resolved with code from a newer library, container engine, etc.).
- Can involve regressions (something that worked before the newer version).
- Can surface CVEs that only show up when running a newer image on an older host.
- Can surface CVEs that only show up when running an older image on a newer host.
- Can cause binaries to run slower or faster than before.
- Can create discrete faults (container won’t start, binaries won’t run).
Demonstrating The Problem
The challenge is that RHEL 6 has already been fixed. Furthermore, the fix was backported to all of the RHEL 6.X container images. This means the bug cannot be reproduced using RHEL 6 container images. Luckily, this bug is very easy to reproduce with Fedora 14, which was the preliminary source for RHEL 6. Using a Fedora 14 image with this older libselinux library, the bug is 100% reproducible.
Though Fedora 14 container images are no longer available publicly, they are easy to build by installing a virtual machine, exporting the root file system, and importing it into a container engine like Podman. For this demo, I have already created a Fedora 14 image and pushed it to Quay.io so that you can test it yourself.
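The export-and-import step might look roughly like the sketch below. It is not the exact procedure used to build the image on Quay.io: it assumes the virtual machine's root filesystem has already been made available in a directory on the container host, and the function names, paths, and image name in the usage note are hypothetical. The only Podman feature it relies on is `podman import`, which turns a tarball into a filesystem image.

```shell
#!/bin/sh
# Turn an exported root filesystem into a container image.

pack_rootfs() {
    # $1: directory holding the exported root filesystem
    # $2: tarball to create (must live outside of $1)
    tar -C "$1" -cf "$2" .
}

import_rootfs() {
    # $1: tarball produced by pack_rootfs
    # $2: name to give the imported image
    podman import "$1" "$2"
}
```

With a root filesystem mounted at a hypothetical `/mnt/fedora14-root`, the two steps would be `pack_rootfs /mnt/fedora14-root /tmp/fedora14.tar` followed by `import_rootfs /tmp/fedora14.tar localhost/fedora14`. Packing should be done as root so that file ownership and permissions survive in the tarball.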
This is but one of many bugs like this that have been resolved, but it’s particularly demonstrative because it shows the genesis of a bug between Linux versions.
This example demonstrates the problems that can crop up when running an older container image on a newer container host. The problem manifested itself when trying to install software. Many packages, like httpd, try to add a user during installation. This bug made it impossible to add a user, and therefore made software like httpd impossible to install.
First, we demonstrate the failure with Fedora 14 on a system running RHEL 7:
    [root@rhel7 containers]# podman run -it quay.io/fatherlinux/fedora14 bash
    [root@daf7e19861e4 /]# useradd fred
    useradd: failure while writing changes to /etc/passwd
Now, we can demonstrate the backported fix with RHEL 6:
    [root@rhel7 containers]# podman run -it rhel6 bash
    [root@1df4c3dc3937 /]# useradd fred
    [root@1df4c3dc3937 /]#
This bug demonstrates a few things. First, there is a clear connection between the container image and the container host. Second, it demonstrates Red Hat Engineering’s ability to find, troubleshoot, repair, and backport fixes for novel compatibility problems. Finally, it highlights the way Red Hat is thinking about the problem over the lifecycle of an enterprise Linux distribution.
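The two interactive sessions can also be wrapped in a small non-interactive check that runs the same `useradd fred` command against any number of images. This is a sketch rather than a Red Hat tool: the function names are invented for illustration, and it assumes Podman is installed and that the images named in the usage note can be pulled on your container host.

```shell
#!/bin/sh
# Check whether `useradd` works inside each image on this container host.

can_add_user() {
    # Succeeds when useradd works in a throwaway container from image $1.
    # "fred" is the account name used in the manual sessions.
    podman run --rm "$1" useradd fred >/dev/null 2>&1
}

check_images() {
    # Report OK or FAIL per image; return non-zero if any image failed.
    failed=0
    for image in "$@"; do
        if can_add_user "$image"; then
            echo "OK    $image"
        else
            echo "FAIL  $image"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_images quay.io/fatherlinux/fedora14 rhel6` on a RHEL 7 host would be expected to mirror the sessions shown earlier, with the Fedora 14 image failing and the RHEL 6 image passing. A check like this can tell you that a compatibility problem exists, but not why.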
Conclusion
To recap, compatibility, portability, and supportability are separate capabilities. All three are important if reliability is the ultimate goal. A CI/CD build of your container images may help you discover when a problem like this crops up, but it will not help you resolve it. To resolve a problem like this, you must have technical experts who can (in this case):
- Discover a compatibility problem
- Resolve it by updating libselinux
- Release a fix in RHEL 7
- Backport the fix to all of the RHEL 6 images (6.5+)
By resolving this bug, Red Hat demonstrated the ability to do all of the above and preserve ABI/API compatibility over the long lifecycle of RHEL major releases, as well as extending support across major versions.
Containers start with Linux. This bug and its resolution highlight that container support starts with Linux support. With the Red Hat Universal Base Image, developers can build and distribute containerized applications anywhere they like, but should problems arise, the risk can be shifted to Red Hat by deploying on OpenShift or RHEL.
Originally posted at: https://www.redhat.com/en/blog/limits-compatibility-and-supportability-containers