Auth and everything wrong, part 2

This is part two of my trilogy laying out what I think I understand about the current state of authorization and everything wrong.

Let’s look at the memes of application security.

GNAP and XYZ

The Grant Negotiation and Authorization Protocol is a purported foundation of XYZ. XYZ, which sums itself up as “transactional authorization,” is probably the closest thing we currently have to OAuth3.

Without getting into too many details, the spirit of transactional authorization is to authorize individual requests to an API based on dynamic scopes. Contrast this to OAuth2’s encouragement to authorize specific parts (usually permissions or roles) within an API, and OAuth1’s very coarse-grained, AuthN-like access (“I signed this request with the secret we agreed upon”).

The motivations behind GNAP and XYZ are noble, and while their homepage notes the proposal has been made “in the spirit of OAuth2,” implementers will face fundamental incompatibilities, in particular polymorphic JSON and additional synchronization between the Resource Server and Authorization Server.

What’s worse than a JWT?

OAuth2 gives developers a bit too much implementation freedom perhaps, but XYZ goes one step further by mandating polymorphic JSON in the token format:

Polymorphic JSON. The protocol elements have different data types that convey additional contextual meaning, allowing us to avoid mutually exclusive protocol elements and design a more succinct and readable protocol. This lets us pass things by reference or by value using the same element field, among other things.

The intent behind this was good: why embed potentially sensitive data within a token when you can reference data with an identifier, assume the Resource Server and Authorization Server both understand the references, and leave anyone peeking at the token with mostly useless information?

Polymorphic JSON says instead of receiving:

{
  "customer": {
    "address": "123 Super Secure Blvd"
  }
}

You could potentially receive:

{
  "customer": "93f63fdb90d69b73db42b433549e5673c860c3ae"
}

Nearly every attempt I’ve seen at replacing JSON (BSON, protocol buffers) has tried to make it easier to parse, whereas polymorphic JSON does the exact opposite: it requires dynamic type checking for every field (counterpoint: safe JSON parsers should be doing that anyway), and here it encourages mixing data types. You may get an identifier, an object, an array… who knows?

I was going to say this complicates things in statically or strongly typed languages, but honestly, I fear reviewing polymorphic JSON implementations regardless of language.
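To make the type-checking burden concrete, here is a minimal sketch of what a consumer of the two "customer" payloads above might have to do. The names `CustomerRef`, `CustomerValue`, and `parse_customer` are my own illustration, not anything defined by XYZ or GNAP, and I’m assuming the only two legal shapes are a string reference or an object with an "address" string.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class CustomerRef:
    """By reference: an opaque identifier that must be resolved elsewhere."""
    ref: str


@dataclass
class CustomerValue:
    """By value: the customer data is embedded directly in the payload."""
    address: str


def parse_customer(raw: str) -> Union[CustomerRef, CustomerValue]:
    """Parse a payload whose "customer" field may be a string or an object."""
    doc = json.loads(raw)
    if not isinstance(doc, dict) or "customer" not in doc:
        raise ValueError("payload must be an object with a 'customer' field")

    customer = doc["customer"]
    # Every polymorphic field needs a branch per allowed type...
    if isinstance(customer, str):
        return CustomerRef(ref=customer)
    if isinstance(customer, dict) and isinstance(customer.get("address"), str):
        return CustomerValue(address=customer["address"])
    # ...plus an explicit rejection of everything else (arrays, numbers, null).
    raise ValueError(f"unsupported type for 'customer': {type(customer).__name__}")
```

Even in this toy case, one field needs three code paths, and every caller has to handle both return types.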

Synchronization Between the Resource Server and Authorization Server

XYZ specifically notes the following feature:

Ease of transition from OAuth 2. Even though this is not backwards compatible, there should be a clear translation path from OAuth 2 based systems to XYZ.

As someone who worked on an OAuth2 proxy, this seems difficult.

Let’s look at OAuth2.1’s protocol flow:

        +--------+                               +---------------+
        |        |--(1)- Authorization Request ->|   Resource    |
        |        |                               |     Owner     |
        |        |<-(2)-- Authorization Grant ---|               |
        |        |                               +---------------+
        |        |
        |        |                               +---------------+
        |        |--(3)-- Authorization Grant -->| Authorization |
        | Client |                               |     Server    |
        |        |<-(4)----- Access Token -------|               |
        |        |                               +---------------+
        |        |
        |        |                               +---------------+
        |        |--(5)----- Access Token ------>|    Resource   |
        |        |                               |     Server    |
        |        |<-(6)--- Protected Resource ---|               |
        +--------+                               +---------------+

Note that the client is potentially making a request to three different entities, which are all separated in this diagram for the purposes of request flow at authorization time.

And here is GNAP’s equivalent sequence:

        +------------+             +------------+
        | Requesting | ~ ~ ~ ~ ~ ~ |  Resource  |
        | Party (RQ) |             | Owner (RO) |
        +------------+             +------------+
            +                            +
            +                            +
           (A)                          (B)
            +                            +
            +                            +
        +--------+                       +       +------------+
        |Resource|--------------(1)------+------>|  Resource  |
        | Client |                       +       |   Server   |
        |  (RC)  |       +---------------+       |    (RS)    |
        |        |--(2)->| Authorization |       |            |
        |        |<-(3)--|     Server    |       |            |
        |        |       |      (AS)     |       |            |
        |        |--(4)->|               |       |            |
        |        |<-(5)--|               |       |            |
        |        |--------------(6)------------->|            |
        |        |       |               |<~(7)~~|            |
        |        |<-------------(8)------------->|            |
        |        |--(9)->|               |       |            |
        |        |<-(10)-|               |       |            |
        |        |--------------(11)------------>|            |
        |        |       |               |<~(12)~|            |
        |        |-(13)->|               |       |            |
        |        |       |               |       |            |
        +--------+       +---------------+       +------------+

    Legend
    + + + indicates a possible interaction with a human
    ----- indicates an interaction between protocol roles
    ~ ~ ~ indicates a potential equivalence or out-of-band communication between roles

The important distinction here is all the extra connections, particularly the squiggly line for step 7:

   *  (7) The RS determines if the token is sufficient for the request
      by examining the token, potentially calling the AS (Section 10.1).
      Note that the RS could also examine the token directly, call an
      internal data store, execute a policy engine request, or any
      number of alternative methods for validating the token and its
      fitness for the request.

For a Resource Server tasked with authorizing a client request, OAuth2 implies it could do purely static authorization based on the token itself: “Does this token look signed and does it have the proper scopes inside?”
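As a rough sketch of that static check, assume a made-up token format: a base64 JSON payload plus an HMAC signature, separated by a dot. This is not a real OAuth2 or JWT format, and `SHARED_KEY`, `issue_token`, and `authorize_statically` are hypothetical names; a real Resource Server would more likely use a vetted JWT library or token introspection. The point is only that the decision needs nothing beyond the token and a key.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key shared by the issuer and the Resource Server.
SHARED_KEY = b"demo-key-not-for-production"


def _sign(payload: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the encoded payload."""
    return hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(scopes) -> str:
    """Build a toy token: base64(JSON claims) + "." + signature."""
    claims = json.dumps({"scopes": sorted(scopes)}).encode()
    payload = base64.urlsafe_b64encode(claims)
    return payload.decode() + "." + _sign(payload)


def authorize_statically(token: str, required_scope: str) -> bool:
    """Decide using only the token: is it signed, and does it carry the scope?"""
    try:
        payload_b64, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # no "." separator, so not one of our tokens
    payload = payload_b64.encode()
    # Constant-time comparison of the expected and presented signatures.
    if not hmac.compare_digest(_sign(payload).encode(), signature.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return required_scope in claims.get("scopes", [])
```

Nothing here talks to the Authorization Server at request time, which is exactly what makes it easy to proxy and cache.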

Meanwhile, a GNAP flow encourages the Resource Server to call out to the Authorization Server or another policy engine to assert authorization: “This request just came in, based on everything you know at the moment, should this be authorized?”

For implementers making this protocol change, step 7 implies an extra hop: the Resource Server dynamically authorizes every request against a policy engine or the Authorization Server directly. While XYZ says there should be a clear translation path, I believe the transition will be difficult.
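Here is a minimal sketch of that extra hop, with an in-memory object standing in for the Authorization Server or policy engine. `PolicyEngine`, `decide`, and `handle_request` are hypothetical names of my own; in a real deployment `decide` would be a network call, which is where the added latency and the new failure mode come from.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyEngine:
    """Hypothetical stand-in for an AS or policy engine whose state can change."""
    grants: dict = field(default_factory=dict)       # token -> {(method, path), ...}
    revoked_tokens: set = field(default_factory=set)

    def decide(self, token: str, method: str, path: str) -> bool:
        """Answer "should this request be authorized right now?"."""
        if token in self.revoked_tokens:
            return False
        return (method, path) in self.grants.get(token, set())


def handle_request(engine: PolicyEngine, token: str, method: str, path: str) -> int:
    """Resource Server handler: consult the engine on every request."""
    return 200 if engine.decide(token, method, path) else 403


engine = PolicyEngine(grants={"tok-1": {("GET", "/orders")}})
first = handle_request(engine, "tok-1", "GET", "/orders")   # allowed by the grant
engine.revoked_tokens.add("tok-1")                          # trust changes mid-session
second = handle_request(engine, "tok-1", "GET", "/orders")  # same token, now denied
```

The same token yields different answers over time, which a purely static check on the token cannot express.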

Zero Trust

Once upon a time, a client was given a secret to sign every request to an API with, which proved who the client was. This was OAuth1, and in this way it sort of conflated authentication with authorization: so long as the secret was the same, the caller was trusted.
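For a feel of that model, here is a deliberately simplified sketch of shared-secret request signing. It is not the actual OAuth 1.0 signature algorithm, which also covers a nonce, a timestamp, and normalized parameters; `sign_request` and `verify_request` are hypothetical helpers that only illustrate the "same secret means trusted caller" idea.

```python
import hashlib
import hmac


def sign_request(secret: bytes, method: str, url: str, body: str = "") -> str:
    """Client side: sign the request details with the pre-agreed secret."""
    message = "\n".join([method.upper(), url, body]).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, method: str, url: str, body: str, signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(secret, method, url, body)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Note that a valid signature says who is calling, and nothing at all about what they should be allowed to do.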

OAuth2 first authenticates a user, then has the user obtain application-specific authorizations. Where this authorization grant came from or went doesn’t really matter, so long as the application deems its contents satisfactory. Applications could reuse a grant to do things on behalf of a user, and thus authentication separated quite a bit from authorization.

Every company has their own set of tenets about what “Zero Trust” means to them.

A good Zero Trust implementation touches application security by blessing a new marriage between authentication and authorization. Every request must come from a trusted, authenticated source before being authorized. As trust changes for any client accessing a resource, it should be handled dynamically.

With more granular authorization, an implied closer relationship between resource and authorizer, and dynamic policy enforcement, GNAP/XYZ shares tenets with Zero Trust. While GNAP/XYZ applies them to an application authorization protocol, Zero Trust creates a larger-scale network (and application) security foundation.

OPA

Suppose you’re looking to implement GNAP or Zero Trust, and you suddenly need an agent to enforce policy. Guaranteed, the first one you’ll find while Googling is Open Policy Agent.

While not necessarily a paradigm or protocol like OAuth, GNAP, XYZ, or Zero Trust, Open Policy Agent deserves a solid mention here for being a genius combination of:

Screenshot from https://www.openpolicyagent.org/

OPA, while technically open, is very much a paid solution. However, it remains perhaps the best implementation of dynamic policy enforcement.

What’s trendy

All of these can be pieced together: GNAP/XYZ’s dynamic policy evaluation on every request (using something like OPA) fits within a Zero Trust-like model.

I find it interesting that these solutions have all developed somewhat independently out of the same quest for more secure, granular access.

What’s frustrating is that these solutions are all perhaps more difficult to implement than a secure OAuth 2, and that they clearly hail from different corners of software at different levels of maturity.

A Final Wishlist

As many lengthy blog posts before mine have established, authorization is hard. For something as deceptively simple as establishing a protocol to return a boolean answer to “can I access this?”, we have some wild solutions.

Next time, given all of the above, together with my complaints and experience with OAuth2, I’ll take a stab at my wishlist for application security.