Securing Microservices with Open Policy Agent

6 min read · Dec 26, 2018

Just before the holiday vacations started and the offices emptied out, I was searching for a solution to a problem. I want to build a multi-tenant cluster that runs services deployed by many different teams, and I want all of those teams to be able to secure their own resources in a nice, clean, highly resilient fashion. Securing those resources can’t slow the teams down or decrease stability.

The big problem my brain was chewing on was how to let different teams enforce security without creating an artificial governance bottleneck that impedes rapid deployment.

In my exploration, I’d come up with an interesting idea: store the list of permissions as triples, inspired by the subject-predicate-object form of RDF statements. For example, user alice can read resource account-1234. A service would then protect its resources by asserting that a user must hold such a permission. RDF is easy to parse, so I figured it was at least a good starting point to explore solutions.
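To make the triple idea concrete, here is a minimal Python sketch. It is purely illustrative: the `Triple` type, the `PERMISSIONS` store, and the `is_permitted` helper are hypothetical names I made up for this example, not part of any RDF library or of OPA.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One permission statement in subject-predicate-object form."""
    subject: str
    predicate: str
    obj: str


# Hypothetical permission store holding "alice can read account-1234".
PERMISSIONS = {
    Triple("alice", "read", "account-1234"),
}


def is_permitted(subject: str, predicate: str, obj: str) -> bool:
    """Return True only when the exact statement exists in the store."""
    return Triple(subject, predicate, obj) in PERMISSIONS


print(is_permitted("alice", "read", "account-1234"))   # True
print(is_permitted("alice", "write", "account-1234"))  # False
```

A service protecting account-1234 would call something like `is_permitted` before serving a request, which is the assertion described above.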

While searching for conventions and standards for this project, the Open Policy Agent (OPA) showed up halfway down the page in one of my search results. About an hour and a half later, I had jumped into that rabbit hole and was loving what I saw.

OPA’s premise is that it lets you define fine-grained policy control at all levels of the stack and, maybe more importantly, decouple the definition of policy from its enforcement.

Think of it like a purpose-built JSON document store. At the foundation level are raw JSON data documents, where you can define whatever you like from users to permission grants to servers or pub/sub topics — anything. You then write declarative rules that generate new, derived data from the foundational data. The output of these rules is usually designed to support the creation of policies. Policies are also documents containing declarations in the Rego language, and these are what you query when you use the Open Policy Agent.

Now, OPA has a huge number of uses, including serving as an admission controller in Kubernetes, and it has dozens of other integrations, including Docker. Those are other people’s problems (for now); I just want to solve my application security issues.

Let’s say that I’ve got some raw data similar to my RDF triples, a list of permission grants of users to restricted resources. In OPA, a data document might look like this:

[
  {
    "mode": [
      "read",
      "write"
    ],
    "resource": "accounts/123",
    "to": "albert"
  },
  {
    "mode": [
      "read"
    ],
    "resource": "accounts/345",
    "to": "albert"
  },
  {
    "mode": [
      "read"
    ],
    "resource": "accounts/123",
    "to": "bob"
  }
]

In this example, Albert has been granted access to two resources and Bob to one. The first thing I want is to be able to write a policy that my account service can check to see if a user should be allowed access.

That policy might look like this:

package demo.authz

import data.grants # access to the raw data
import input as http_api

default read = false

read = true { # jwt_token: the decoded JWT, assumed to be defined elsewhere
  grants[i].to = jwt_token.payload.user
  grants[i].mode[_] = "read"
  grants[i].resource = http_api.path
}

This syntax might look a little strange. I’ve created a policy called read in the package demo.authz. This policy is a set of statements that must all be true in order for the result to be true. So, in this case, the username on the incoming JWT token (OPA supports tokens natively!) has to be the one that received a grant where the mode array contains a "read" value and the resource property of the grant matches the path variable on the input to the policy query. I would then just issue a query via OPA’s HTTP API for demo.authz.read, supplying the necessary input, and expect a boolean reply. I could also use rules and policy to filter the list of resources available to a given user.
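Here is a rough sketch of what that query could look like from a Python service, using only the standard library. The assumptions are mine: an OPA server listening at `localhost:8181`, and an input document with a `token` key alongside the `path` key the policy reads. The `build_read_query` and `is_read_allowed` helpers are hypothetical names for this example.

```python
import json
import urllib.request

# Assumed address of a locally running OPA server.
OPA_URL = "http://localhost:8181"


def build_read_query(user_token: str, path: str) -> urllib.request.Request:
    """Build a POST to OPA's Data API for the demo.authz.read rule."""
    # The Data API expects the query input wrapped in an "input" key.
    # The "token" key is an assumption; "path" is what the policy reads.
    body = json.dumps({"input": {"token": user_token, "path": path}})
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/demo/authz/read",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_read_allowed(user_token: str, path: str) -> bool:
    """Send the query and treat a missing or non-true result as a denial."""
    with urllib.request.urlopen(build_read_query(user_token, path)) as resp:
        return json.load(resp).get("result", False) is True
```

Treating anything other than an explicit `true` as a denial keeps the calling service fail-closed if the rule is undefined or the package path is mistyped.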

Hopefully you’re as excited about the use of this Rego language and the merging of data, rules, and policies as I am. This gives us way more power and flexibility than traditional RBAC implementations.

I could get fancier here and write more strict rules for data modification. For example, I could make a policy that prevents transferring money between accounts if you haven’t entered an MFA token within the last 15 minutes, even if you do have the permission grant.
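In practice that rule would be written in Rego, but the logic is easy to illustrate in Python. This is only a sketch of the decision itself; the `may_transfer` function and its parameters are hypothetical, and where the MFA timestamp comes from is left open.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# The 15-minute freshness window from the example above.
MFA_WINDOW = timedelta(minutes=15)


def may_transfer(has_grant: bool, last_mfa_at: Optional[datetime], now: datetime) -> bool:
    """Allow a transfer only with a grant AND a sufficiently recent MFA check."""
    if not has_grant or last_mfa_at is None:
        return False
    age = now - last_mfa_at
    # Reject timestamps from the future as well as stale ones.
    return timedelta(0) <= age <= MFA_WINDOW


now = datetime(2018, 12, 26, 12, 0, tzinfo=timezone.utc)
print(may_transfer(True, now - timedelta(minutes=5), now))   # True
print(may_transfer(True, now - timedelta(minutes=20), now))  # False
```

The point is that the permission grant is necessary but no longer sufficient: both conditions have to hold for the policy to say yes.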

This is the kind of power and flexibility I was looking for. I started thinking about how I was going to manage the data documents (e.g. the list of grants), the rules (applying higher-level meaning to the raw data), and the policies (tenant-written security declarations). I could just maintain the text files in a private GitHub repository or I could store them as secrets in Kubernetes, but OPA has an even more powerful option — a pluggable control plane.

At this point I could feel the dopamine dripping in my brain and I felt compelled to learn a lot more about the Open Policy Agent and its inner workings. I’m always skeptical of systems that can have multiple sources of (potentially conflicting) truth, but OPA wasn’t designed that way. You can either issue HTTP requests to load data, rules, and policies, or you can configure OPA to download them periodically from a bundle server, which essentially makes all those JSON documents available in a structured tarball. It’s even ETag-aware, so it won’t re-download an archive that hasn’t changed.

So if I already have my own internal source of truth, I just expose a little bundle service, and every OPA instance in my cluster (you can run it as a sidecar if you like) will have synchronized data.
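As a sketch of what the core of such a service might do, the Python below packs the grants and a policy into a gzipped tarball and answers conditional requests with an ETag. The file names, the SHA-256 ETag, and the `build_bundle`, `etag_for`, and `respond` helpers are my own assumptions for illustration. My understanding of the bundle layout is that a `data.json` file under `grants/` is loaded as `data.grants`; check the OPA bundle documentation before relying on it.

```python
import hashlib
import io
import json
import tarfile
from typing import Optional, Tuple


def build_bundle(grants: list, policy_source: str) -> bytes:
    """Pack data and policy into a gzipped tarball laid out as a bundle."""
    files = {
        # Assumed layout: grants/data.json is exposed as data.grants.
        "grants/data.json": json.dumps(grants).encode("utf-8"),
        "demo/authz.rego": policy_source.encode("utf-8"),
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def etag_for(bundle: bytes) -> str:
    """Derive a quoted ETag from the bundle contents (an assumed scheme)."""
    return '"' + hashlib.sha256(bundle).hexdigest() + '"'


def respond(bundle: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return (status, body): 304 when the caller already has this bundle."""
    if if_none_match == etag_for(bundle):
        return 304, b""
    return 200, bundle
```

A real service would build the bundle once per change to the source of truth and wrap `respond` in an HTTP handler, sending the ETag header along with each 200 response.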

At this point I’m thinking, “that’s great, but there’s no way I’ll be able to make this thing compliant, especially for financials.” OPA’s creators thought of that as well.

You can configure OPA to send a full log of decisions (including the exact versions of the bundle, policy, etc. used to inform each decision) to a service you stand up, which can then securely record this information for compliance. The upload is asynchronous, so log shipping won’t slow down policy evaluations.
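A minimal sketch of the receiving side, assuming each upload arrives as a gzip-compressed JSON array of decision events. The field names used here (`decision_id`, `path`, `result`, `timestamp`) reflect my reading of the decision log format and should be verified against the OPA documentation; the `parse_decision_log_upload` helper is hypothetical.

```python
import gzip
import json


def parse_decision_log_upload(body: bytes) -> list:
    """Decode one upload and keep the fields an audit record would need."""
    events = json.loads(gzip.decompress(body))
    return [
        {
            # Field names are assumptions about the event format.
            "decision_id": event.get("decision_id"),
            "path": event.get("path"),
            "result": event.get("result"),
            "timestamp": event.get("timestamp"),
        }
        for event in events
    ]
```

A compliance service would append these records to write-once storage rather than keeping them in memory.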

In short, my googling for a convenient way to store permission grant statements didn’t just turn up a solution to one problem, it turned up a solution to a ton of problems I hadn’t even gotten around to solving yet.

This ticked all of my boxes:

Fast and reliable: I can use it without impacting the SLA of my services. You can run it centrally in a cluster or run it as a sidecar next to each secured service.
Pluggable control plane: download versioned updates from a source of truth, allowing changes in data and policy with no downtime.
Compliance: automated submission of version-tagged decision logs to a service.
Easy-to-use Rego language for authoring, reading, and socializing policies.

In addition to all of these goodies, OPA also has a REPL you can use to run, test, and design your policies and data structures. It has a complete set of unit testing tools so that we can have confidence in our policies and make policy tests a build-breaking part of a CI pipeline. It even has a VS Code plugin that lets you highlight and evaluate rules and query policies right within the IDE.

I’ve still got a lot to learn and a lot more exploring to do, but from what I’ve seen so far, Open Policy Agent seems like an ideal way to take care of a number of authorization issues in an elegant, cloud-native fashion.

Written by Kevin Hoffman