Authentication Flows Explained
This video is number two of the Zero Trust Authentication Master Class series.
Transcription
So, my name is Jasson Casey. I am the CTO at Beyond Identity. And now, we're going to descend into some of the backstory or really the basics of how does authentication and access work today.
And then we'll get into some of the problems. What are the problems with access and authentication today? We're going to keep referring back to our Zero Trust framework that we established a little bit earlier as a way of kind of setting a metric or an objective. Like, what's the standard that we want to achieve in our access and authentication?
But before we could do any of that, we kind of need to develop some tools to be able to kind of analyze different scenarios. And the primary tool that we're going to develop is this thing called a sequence diagram. Some of you have probably heard of this and know exactly what I'm talking about. Maybe you want to skip through this. A lot of you probably have never really seen this, or have seen it but don't really understand how it works.
So I'm going to try and give you a basic explanation. We're going to go to this very, very simple login flow. Everyone's logged into services before, so we all understand how that works, right? So, let's say I'm at a computer and I launched my browser, right? And I say, "Browser, I want to go to bank.com, right?" And so my computer is going to reach out to bank.com.
And bank.com's going to serve me. So, we'll call this step one. Bank.com is going to serve me a page. And this is all, again, something that everyone understands, right? So I'm at bank.com.
And I've been served this page. And on this page, I might have two little boxes where one says log in, and the other says password. And there's a little button usually saying, log on or log in the bank. They've got money. Money. Okay.
So, step one, put in the browser where I want to go. Step two, the bank basically sends me out a page saying, "I want to collect some authentication material from you." So I will then type in my username and my password. And so what happens next is I will supply that information, right? Step four, oops.
Back up to the bank. So, again, typically, you see your box, you've typed in your name. Let's say my name is Alice, because you can't start a name without Alice in cryptography.
I almost said crypto, but crypto now means something else. Right. Typically, I can't see the password I'm typing in. So, oh, wow, this says step four. It should say step three. Okay. So I go to the browser, I put my URL in, or the name I want to go to, the bank serves back a page.
I then put my credentials in that page. It supplies those credentials to the bank. And then the bank will kind of try and verify those credentials, right? And that's actually step four. And the verification can basically end up being a yes or a no. But let's assume it's a yes. Right.
And so, typically, at this point, you will go from what's called an untrusted session to a trusted session. And it will maybe even kind of greet you, right? Welcome, Alice. Here's how much you have all the monies in your bank account, right?
Okay. So, what's the flow look like? The flow looks like I want to access the service. The service starts this untrusted session and says, "I need you to prove the identity of who you are in this moment." So, I'll go through a basic proof setup, i.e. username and password. And then it will decide if I am, in fact, that person or not.
And in this scenario, we decided the affirmative. So, now, let's look at this in a slightly different way. So this is what we call a sequence diagram. So I have Alice represented on this line. And then over here, I have the bank represented on this line. And so the first thing that happens is Alice will reach out to the bank saying, "I want to access this page."
Right? So, again, time flows in this direction. So if you see arrows in this direction, you know it's later than things that occurred higher up. And I typically like to angle my arrows a little down just to represent that network transit is not free, unfortunately. And then one other thing that we'll talk about is, we'll make it a little bit more real. When I'm accessing bank.com, I'm actually using HTTP.
So I'm going to go ahead and use a little bit of HTTP jargon, right? So a typical webpage request is a GET, and I'm getting a resource at bank.com, right? So it does a little bit of processing and serves me the landing page. And usually, that response is what's called a 200 OK. And that's where I get that kind of page that you saw a minute ago, basically asking for my username and my password, right?
If we wanted to label this, this was step one, this was step two. All right. So at this moment in time, this display is now showing to Alice. And Alice is going to enter a username and a password. So, this, typically, is not going to be a GET, it's going to be a POST.
And I'll tell you a little bit more about that in a second. But in the body of that POST is going to be the username and the password. And then the bank is going to verify the username and the password. And again, we kind of assume the happy path.
So let's just assume that these are valid credentials, right? So, at this point, once they're verified, bank's also then going to look up, you know, details about Alice, right? So it can then show Alice kind of an authenticated view, right? So maybe we're telling her how much money, she has all the monies, remember how much money she actually has in her bank account.
So step three and step four. So very rudimentary scenario. Lots of people like to draw these topologies and then overlay flow. But it's kind of hard to analyze certain situations when you draw things like this. So I prefer to talk about sequence diagrams. So I'm going to use a lot of sequence diagrams coming up.
And the basics of a sequence diagram really is time flows down, and each line represents a participant. And using the language that we talked about earlier, you can think of each line as representing a resource, right? This is our server resource. This is our endpoint client or asset client. It's a little bit more complicated. Technically, Alice is an agent interacting with the machine, but for now, that's totally fine. And we're going to get a little bit more specific and talk about some of the protocols that are actually happening back and forth, right?
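To make the four steps of the diagram concrete, here is a minimal Python sketch that replays them in time order. It is an illustration, not how any real bank works: the `Bank` class, the `/login` path, the in-memory credential store, and the 401 status for a failed login are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    # Hypothetical in-memory stores, for illustration only.
    credentials: dict = field(default_factory=lambda: {"alice": "correct horse"})
    balances: dict = field(default_factory=lambda: {"alice": "all the monies"})

    def handle(self, method, path, body=None):
        """Return (status, page) for one request arriving on the bank's line."""
        if method == "GET" and path == "/":
            # Step 2: serve the landing page with the login form (untrusted session).
            return 200, "login form: username, password"
        if method == "POST" and path == "/login":
            # Step 4: verify the credentials carried in the POST body.
            body = body or {}
            user = body.get("username")
            if user in self.credentials and self.credentials[user] == body.get("password"):
                return 200, f"Welcome, {user.capitalize()}. Balance: {self.balances[user]}"
            return 401, "login failed"  # assumed status for the unhappy path
        return 404, "not found"


def login_flow(bank, username, password):
    """Walk Alice's line of the diagram and record each arrow, top to bottom."""
    trace = ["1. Alice -> Bank: GET /"]
    status, page = bank.handle("GET", "/")
    trace.append(f"2. Bank -> Alice: {status} ({page})")
    trace.append("3. Alice -> Bank: POST /login (credentials in body)")
    status, page = bank.handle(
        "POST", "/login", {"username": username, "password": password}
    )
    trace.append(f"4. Bank -> Alice: {status} ({page})")
    return trace


for arrow in login_flow(Bank(), "alice", "correct horse"):
    print(arrow)
```

Each printed line corresponds to one arrow in the sequence diagram, and the order of the list plays the role of time flowing down the page.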
I said typically it's a POST, not a GET. And the reason for that is in real life you almost never talk directly to the bank servers, you almost always go through intermediate proxies, load balancers that are probably not run by bank.com. They might be run by Cloudflare or Amazon's version of the Cloudflare product. And remember earlier we were talking about one of the problems that Zero Trust sets up is you may be operating on infrastructure that is not yours.
You being an enterprise. That is a perfect example of these requests going through infrastructure that is not the bank's. And as we all know, when you go through web infrastructure like proxies, the request URI or the information that's contained in the first line of the message actually typically shows up in logs. So, the reason for using a POST versus a GET is so you can shift a lot of information into the body, keep that stuff out of the logs, but even then, you're still operating in a way where now this third party has visibility to your service.
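The difference between the two methods can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `proxy_access_log` is a hypothetical intermediary that records only the request line, and the `/login` path and sample credentials are invented for the example.

```python
from urllib.parse import urlencode


def build_get(path, params):
    """Credentials in the query string: they become part of the request line."""
    request_line = f"GET {path}?{urlencode(params)} HTTP/1.1"
    return request_line, ""


def build_post(path, params):
    """Credentials in the body: the request line carries only the path."""
    request_line = f"POST {path} HTTP/1.1"
    return request_line, urlencode(params)


def proxy_access_log(request_line):
    """Hypothetical intermediary that logs only the first line of each request."""
    return f'proxy.example - "{request_line}"'


creds = {"username": "alice", "password": "s3cret"}
get_line, _ = build_get("/login", creds)
post_line, post_body = build_post("/login", creds)

print(proxy_access_log(get_line))   # the password is visible in the log entry
print(proxy_access_log(post_line))  # only the method and path are logged
```

Note that the POST only keeps the credentials out of this kind of log. An intermediary that terminates the connection can still read `post_body`.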
And while we trust that third party to operate in good service or in good faith, you know, insider threat is the very definition of, even though we trust people in good faith, they don't always operate in good faith 100% of the time. So as we've been describing earlier, we have this very basic authentication sequence. And over here, we're kind of showing a little representation as a sequence diagram.
But there's a couple of things I wanted to highlight about the authentication itself. So the way the authentication pattern actually works is bank.com, it's verifying that this username and password is correct, right? Now, the interesting thing to observe is the only thing that really is identifying Alice to bank.com is this username and password, right?
So that's kind of a critical piece of data, right? So, let's say our critical data in this scenario is my username and my password. Now, obviously, I know we're making a naïve argument right now and we can strengthen it, but really all of the strengthening of the argument doesn't change much.
But I will show that to you in a minute. So, critical data, username, and password. Now, as we all know, when we think about data, we have to think about protecting that data in what's called in motion and at rest, right? So, I'm sure you've all heard this before, in motion and at rest. Basically means if the username and password is the only thing that identifies Alice to bank.com, then whoever has that username and password is going to be Alice as far as bank.com is concerned.
So protecting this username and password and making sure it's only ever available to Alice is actually a very important task. So, how hard is that task? Well, we know we can define the surface area of the problem, right, or the attack surface by really kind of analyzing the motion of the data like how is it being protected when it's in motion?
And how is it being protected when it's at rest, right? It lives in Alice's head. We're not going to do much about Alice or her head. But it also lives right over here in Alice's machine when Alice actually punches the username and password into her machine, right?
So the question from an at-rest perspective is, how am I protecting the username and password on Alice's machine? It clearly lives somewhere in bank.com, right? Otherwise, it could not complete this step. So there's an obvious question of how do we actually secure that inside of bank.com? And also, it exists in motion as we convey it to bank.com.
So how are we protecting that username and password across that connection? And it may seem kind of simple. In the back of your mind, you may be thinking, "Well, I use TLS here and it's private, therefore no one can see." And if that was the situation, you would be correct, except for the fact that that's almost never the situation, right?
This is almost never end-to-end TLS, it's almost always re-terminated at some sort of network edge, right? So then how do I protect it on that re-termination infrastructure? Is that re-termination infrastructure even mine? The point I'm really trying to make is when we look at the vulnerability created by username and password, and we analyze the surface area of how to protect the username and password, it's not just in the service provider, it's also in every machine the user ever uses as well as all of the infrastructure necessary to connect these two during their transaction.
So, the point I'm really making is the surface area is incredibly large. And if you can't tell, I'm starting to motivate a problem of, is there a way of coming up with an authentication protocol that doesn't involve verifying a shared secret? Because if there was, we could actually at least remove one of these dimensions of the surface area that we have to worry about protecting.
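To see why a shared secret behaves this way, consider a minimal Python sketch of the verification step. The talk does not say how bank.com stores passwords; the salted PBKDF2 hash used here is an assumption chosen as one common way to protect the secret at rest, and the `enroll` and `verify` names are hypothetical.

```python
import hashlib
import hmac
import os


def enroll(password, iterations=100_000):
    """Store a salted hash rather than the password itself (at-rest protection)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest


def verify(record, presented_password):
    """The server still has to receive the plaintext secret to check it."""
    salt, iterations, digest = record
    candidate = hashlib.pbkdf2_hmac(
        "sha256", presented_password.encode(), salt, iterations
    )
    return hmac.compare_digest(candidate, digest)


store = {"alice": enroll("correct horse")}

alice_ok = verify(store["alice"], "correct horse")  # Alice typing her password
thief_ok = verify(store["alice"], "correct horse")  # anyone else who learned it
wrong = verify(store["alice"], "guess")

print(alice_ok, thief_ok, wrong)
```

The check cannot tell Alice from someone who copied her password, because the two calls are identical. Hashing strengthens the at-rest copy inside the service, but the secret still exists on every machine where it is typed and on every hop that can read it in transit.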