I know what I’d be thinking if I were you: “What, another OIDC explanation? Why?” There is so much written on OIDC, and much of it makes OIDC seem way more complicated than it is. So I’m writing a series that explains it in a way I think will make more sense to Ops/SREs. In my previous post, I started out planning to lay out an entire recipe for setting up an automated Crossplane cluster tenant provisioning process, including automated OIDC configuration so that only the proper identities could access them. Then I pumped the brakes.
As I began writing it up, I realized I would be doing more harm than good. It has taken me years of experience to understand all of the moving parts involved, and just writing out the steps in copy/paste format would most likely enable people to implement something without actually understanding it. My blog has always been intended to help people understand first, and then be enabled.
So, I intend to do that beginning here. In this next series of posts, I will walk through the various pieces of the overall solution to give you the understanding you need to build it. (I won’t cover Crossplane as the end result; that can be deduced with an understanding of Crossplane.)
This will require a lot of posts, as it covers topics that have taken me years to understand. I’m going to begin with the building blocks of OIDC.
OIDC Building Blocks – JSON Web Tokens
In a few previous posts, I touched on some topics of TLS. I talked about public/private key pairs and asymmetric encryption. Key pair/asymmetric encryption is the base building block of nearly all authenticated encryption.
With OIDC, we’ll see that something called JSON Web Tokens (JWTs) are used. Key pair authentication and encryption are a ‘key’ part of this component of OIDC. So rather than beginning with an overview of every OIDC component, I’ll begin with just an overview of JWTs and how they work in general.
Quick (super simplified) overview of asymmetric encryption. Asymmetric encryption is where two encryption keys exist, inextricably related to one another. Let’s call them key A and key B for now. If I encrypt a value with key A, it can only be decrypted with key B, and vice versa. There is no possible key C that could decrypt anything encrypted with key A or key B.
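To make the key A / key B relationship concrete, here is a minimal sketch using textbook RSA with tiny primes. The numbers and the apply_key helper are my own illustration, and keys this small are trivially breakable, so treat it as a demonstration of the relationship only.

```python
# Toy RSA with tiny textbook primes. Insecure by design; it only
# illustrates how the two keys relate to each other.
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
key_a = 17                     # first exponent (chosen)
key_b = pow(key_a, -1, phi)    # matching exponent (needs Python 3.8+)


def apply_key(value: int, key: int) -> int:
    """Encrypting and decrypting are the same operation; only the key differs."""
    return pow(value, key, n)


message = 65

# Encrypt with key A, decrypt with key B.
ciphertext = apply_key(message, key_a)
assert apply_key(ciphertext, key_b) == message

# And vice versa: encrypt with key B, decrypt with key A.
assert apply_key(apply_key(message, key_b), key_a) == message

# Applying key A a second time does not undo key A.
assert apply_key(ciphertext, key_a) != message
```

Real key pairs use primes hundreds of digits long plus padding schemes, which is why you would reach for a vetted crypto library rather than raw arithmetic like this.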
Then we have hashing algorithms. A hash is not encryption; it is a one-way function (i.e., once we ‘scramble’ a value by hashing it, there is no way to unscramble it). When a hash of a value is sent along with the original value, the receiver can run the same algorithm over the value that arrived and compare the result to the hash that was sent. If the two match, the value arrived exactly as it was sent.
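The hash comparison can be sketched in a few lines with Python’s standard hashlib module. SHA-256 is my choice for the illustration, and the message text and function names are made up.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def arrived_intact(received: bytes, sent_digest: str) -> bool:
    """Re-hash what arrived and compare it to the hash that was sent."""
    return digest(received) == sent_digest


original = b"provision tenant cluster-a"
sent_digest = digest(original)

# The untouched message matches its hash.
assert arrived_intact(original, sent_digest)

# Changing even one character produces a different hash.
assert not arrived_intact(b"provision tenant cluster-b", sent_digest)
```

On its own this only detects accidental changes, since anyone who alters the message could also recompute the hash. Signing the hash, covered next, closes that gap.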
Sidenote: Hash algorithms are a key component of password-based security. Ever wonder why your company keeps making you create longer and more random passwords? While hashes are supposed to be one-way, compute power increases every year, and it becomes less expensive to reverse engineer a hashed value to its original. So hash algorithms evolve (e.g., LM hash -> NTLM hash -> NTLMv2 hash), and best practices evolve to create passwords that are more difficult/expensive to crack. Ultimately, passwords alone are no longer considered sufficient. This is why we now have multi-factor auth, one-time tokens, and so on. This is more of an issue when we’re using a hash of the same value (a password) for a long time. In the case of hashing one-time and short-lived messages, it is of little to no concern.
In public/private key pair encryption/authentication, I keep one of the keys known only to me. I freely distribute the other key to anyone that needs to encrypt data meant only for me. I think the private and public key are obvious from that.
But there is also a notion of ‘authentication’ (i.e. verification of who I am) enabled here as well. This is accomplished by me sending you a message ‘signed’ with my private key. I create this signature by hashing some text and then encrypting that hash with my private key. I send you the text along with the signature. You use my public key to decrypt the hash, hash the text I sent you, and compare the two hashes. If they match, you know that I possess the private key. And since only I should possess the private key, it must be me.
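Here is a minimal sketch of that sign-and-verify exchange, again with toy RSA arithmetic and SHA-256. The primes, function names, and message are my own illustrative choices, and real signature schemes add padding that is omitted here.

```python
import hashlib

# Toy key pair built from two well-known Mersenne primes. Far too small
# for real use; a real system would rely on a vetted crypto library.
p, q = 2**31 - 1, 2**61 - 1
n = p * q
public_exponent = 65537
private_exponent = pow(public_exponent, -1, (p - 1) * (q - 1))  # Python 3.8+


def hash_to_int(text: bytes) -> int:
    """SHA-256 of the text, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(text).digest(), "big") % n


def sign(text: bytes) -> int:
    """Sender: hash the text, then encrypt the hash with the private key."""
    return pow(hash_to_int(text), private_exponent, n)


def verify(text: bytes, signature: int) -> bool:
    """Receiver: decrypt the signature with the public key, compare hashes."""
    return pow(signature, public_exponent, n) == hash_to_int(text)


message = b"deploy tenant cluster-a"
signature = sign(message)

assert verify(message, signature)                         # genuine message
assert not verify(b"deploy tenant cluster-b", signature)  # altered message
```

Notice that verify never touches the private exponent. Anyone holding the public key can check the signature, but only the holder of the private key could have produced it.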
The final piece of the asymmetric encryption/authentication puzzle to cover is certificates. How do you know my public key in the first place? I give you a certificate that says this is my public key (it contains my public key). How do you know it’s me providing you that certificate? My certificate is signed by another party that you already trust. That party is known as a root level Certificate Authority (CA). Your applications and services are configured with certificates of trusted CAs. My certificate includes my public key, plus metadata conveying what my certificate is used for, and a signature of that data. The signature is a hash of the clear text data that has been encrypted with the CA’s private key.
So the CA is the top of the trust chain. If a root CA is compromised, we have chaos. This is where PKI comes in, out of scope for this post.
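To tie the certificate idea back to signatures, here is a rough sketch of a CA signing a certificate and a receiver checking it. The certificate here is just a JSON dictionary, nothing like a real X.509 structure, and every name, field, and key size is made up for illustration.

```python
import hashlib
import json


def make_keypair(p: int, q: int, e: int = 65537):
    """Return (public, private) toy RSA keys as (exponent, modulus) tuples."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # Python 3.8+
    return (e, n), (d, n)


def hash_to_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def canonical(fields: dict) -> bytes:
    """Serialize the fields the same way on both sides before hashing."""
    return json.dumps(fields, sort_keys=True).encode()


def issue_certificate(fields: dict, ca_private) -> dict:
    """CA: hash the clear text fields and encrypt the hash with its private key."""
    d, n = ca_private
    signature = pow(hash_to_int(canonical(fields), n), d, n)
    return {"fields": fields, "signature": signature}


def certificate_is_trusted(cert: dict, ca_public) -> bool:
    """Receiver: check the signature using the CA public key it already trusts."""
    e, n = ca_public
    return pow(cert["signature"], e, n) == hash_to_int(canonical(cert["fields"]), n)


ca_public, ca_private = make_keypair(2**31 - 1, 2**61 - 1)
my_public, _my_private = make_keypair(2**13 - 1, 2**17 - 1)

cert = issue_certificate(
    {"subject": "blog-author.example", "public_key": list(my_public),
     "usage": "digital signature"},
    ca_private,
)
assert certificate_is_trusted(cert, ca_public)

# Swapping in a different public key invalidates the CA's signature.
forged = {"fields": {**cert["fields"], "public_key": [65537, 12345]},
          "signature": cert["signature"]}
assert not certificate_is_trusted(forged, ca_public)
```

The receiver only needs the CA’s public key ahead of time. That is the sense in which applications are configured with the certificates of trusted CAs.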
Ok, we’ll talk about asymmetric encryption and hashing as we go through JWTs. With that groundwork laid, we can get into JWTs.
How the RFC (RFC 7519) describes a JWT:
JSON Web Token (JWT) is a compact, URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is used as the payload of a JSON Web Signature (JWS) structure or as the plaintext of a JSON Web Encryption (JWE) structure, enabling the claims to be digitally signed or integrity protected with a Message Authentication Code (MAC) and/or encrypted.
To elaborate on that relatively vague description, a JWT is JSON formatted data that is signed, potentially encrypted, and compact enough to be included as part of an HTTP/S URL. Essentially, it is data that can be added to a URL string along with data integrity, authentication, and encryption capabilities. This includes the use of the certificate concepts discussed above. You’ll also notice the mention of JWS and JWE. This indicates to us that there are sub-categories of JWTs. A JWS is a JWT that is signed but not encrypted. A JWE is a JWT that is encrypted. When both signing and encryption are needed, the recommendation is to create a JWS first, and then encrypt it (essentially wrap it) in a JWE.
The basic structure of a JWS is a header describing the signing algorithm, the JSON payload, and the signature. In a URL, it looks like header.payload.signature, with each part base64url-encoded. A deeper understanding of the structure and inner workings isn’t necessary to understand JWTs for my intended purpose of this series. Lots of info out there if you’d like to go deeper though.
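For a feel of what header.payload.signature looks like in practice, here is a small sketch that builds and checks a JWS using only Python’s standard library. It uses HS256 (an HMAC with a shared secret) purely because the standard library has no asymmetric signing; OIDC providers commonly sign with an asymmetric algorithm such as RS256 instead. The claims, secret, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the compact JWS form expects."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_jws(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (b64url(json.dumps(header).encode()) + "."
                     + b64url(json.dumps(claims).encode()))
    mac = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)


def read_jws(token: str, secret: bytes) -> dict:
    """Verify the signature, then return the payload claims."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))


secret = b"demo-shared-secret"  # hypothetical; never hard-code real secrets
token = make_jws({"sub": "sre@example.com", "iss": "https://issuer.example"}, secret)

print(token)                    # three base64url parts separated by dots
print(read_jws(token, secret))  # the original claims, once verified
```

A production verifier would also check the header’s alg value and claims such as expiry, so in real code you would use a maintained JWT library rather than this sketch.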
Long story short, JWTs enable us to send structured data in a URL string, that we can sign for data integrity and authentication, and encrypt for privacy.
In the case of OIDC, we use JWTs as the primary data sharing mechanism between the entities involved in establishing authorization and authentication.
In my next post, I’ll cover what OIDC is, what OAuth 2.0 is, the defined components of an ‘authorization code flow’ (including JWTs), and how they work.