Introducing EGF2: Scalable Graph Oriented Backend API Framework

We are proud to announce that we’ve open sourced EGF2 — a lightweight Node.js framework inspired by the Flow Health platform architecture for building complex distributed systems.

We’d like to try a bit of an experiment here at Flow Health. Instead of building a project internally, deploying it into production, and waiting until we think it’s fully polished before open sourcing it, we decided to “throw it over the wall” before we’ve even deployed it for production use. This is not to say that others haven’t already built complex systems and put them into production.

With EGF2, you can build production-ready, scalable backends in days rather than months or even years. EGF2 consists of a collection of stateless microservices, each of which can be scaled independently, and is built on top of widely adopted open source technologies such as Apache Cassandra (or RethinkDB), Apache Kafka (or AWS Kinesis), Redis and Elasticsearch.

The Flow Health platform is built primarily with Golang and Scala, plus a few services written in Java. So why did we build EGF2 in Node.js? Last year, part of our development team wanted to try something different and simplify aspects of our architecture. Instead of working from our extensive codebase, a small team took it upon themselves to engineer a framework using Node.js, a JavaScript runtime widely used by start-ups.

Less total work than open sourcing internal software already in widespread use

Extracting an internal project so that it can be open sourced often requires untangling the code from internal dependencies, repackaging it, separating out any company-specific business logic and configuration settings, reorganizing the documentation, and, often, writing a bunch more documentation. Once the software is in widespread use internally, doing all this can be disruptive and can even require painful migrations of internal software. The effort is often much smaller if the project is open sourced from the start. So let me tell you more about EGF2.

EGF2 consists of a collection of services (I could probably call them microservices, even though we were not consciously following the microservices paradigm while working on them). I will address each of them in turn below, but before we get there I would like to clarify a couple of things.

EGF2 provides a graph-oriented API. The way API endpoints are structured is very similar to Facebook’s Graph API. We felt strongly that a graph API is beneficial for projects with broad, non-trivial data models that are heavy on relations between entities. A graph API allows us to declare a set of endpoints at the beginning and never really add new ones, no matter how we expand or modify the data model. And we have to admit, we liked what Facebook was doing with their Graph API.
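To make the "declare the endpoints once" idea concrete, here is a minimal sketch of a graph-style API in plain Node.js. It is an illustration under stated assumptions, not EGF2's actual code: the `/graph/{id}` and `/graph/{id}/{edgeName}` path patterns, the in-memory `objects` and `edges` maps, and the helper names `addObject`, `addEdge` and `handleGet` are all hypothetical.

```javascript
// Hypothetical in-memory graph store; not EGF2's actual storage or API.
const objects = new Map(); // object id -> object
const edges = new Map();   // "srcId/edgeName" -> array of destination ids

function addObject(obj) {
  objects.set(obj.id, obj);
  return obj;
}

function addEdge(srcId, edgeName, dstId) {
  const key = `${srcId}/${edgeName}`;
  if (!edges.has(key)) edges.set(key, []);
  edges.get(key).push(dstId);
}

// Resolve a request against two generic path patterns (assumed for this sketch):
//   GET /graph/{id}              -> a single object
//   GET /graph/{id}/{edgeName}   -> objects linked from {id} by {edgeName}
function handleGet(path) {
  const parts = path.split('/').filter(Boolean);
  if (parts[0] !== 'graph' || parts.length < 2 || parts.length > 3) {
    return { status: 404, body: null };
  }
  const [, id, edgeName] = parts;
  if (!objects.has(id)) return { status: 404, body: null };
  if (edgeName === undefined) return { status: 200, body: objects.get(id) };
  const ids = edges.get(`${id}/${edgeName}`) || [];
  return { status: 200, body: ids.map((dst) => objects.get(dst)) };
}

// Initial data model: users and posts.
addObject({ id: 'user-1', type: 'user', name: 'Ada' });
addObject({ id: 'post-1', type: 'post', text: 'Hello' });
addEdge('user-1', 'posts', 'post-1');

// The data model grows (a new object type and a new edge name),
// but the routing code above does not change.
addObject({ id: 'clinic-1', type: 'clinic', name: 'Main St' });
addEdge('user-1', 'clinics', 'clinic-1');

console.log(handleGet('/graph/user-1/posts').body);
console.log(handleGet('/graph/user-1/clinics').body);
```

Because the object type and edge name are data rather than part of the route definition, adding the `clinic` type required no new endpoint. A real backend would add authorization, validation and pagination on top of this shape.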

EGF2 is not a web framework like Ruby on Rails or Django. It is a pure API backend framework built with scalability and fault tolerance in mind. It includes client-side libraries for iOS, Android and HTML5 (Coming Soon!).

Why open source EGF2 — and why now?

We recently announced our Medical Knowledge Graph to inform medical decision making. We aim to embed this knowledge graph into wide-ranging clinical and patient facing applications to advance personalized, data-driven medicine. By open sourcing a light framework, we hope to spur more development using dynamic ontologies (graph-based data models) and graph APIs that will make for seamless integration of the Flow Health Medical Knowledge Graph.

What are the advantages of using EGF2 with the Flow Health Medical Knowledge Graph?

Data is stored and modeled in exactly the same way, making integration between a solution built on top of EGF2 and the Medical Knowledge Graph super easy. Integration can be done at both the API level and the data level, where EGF2 can exchange data with the Medical Knowledge Graph for analysis that provides predictions, decision support and scientific insights.

Even though EGF2 is currently a simplified version of the Flow Health platform, we hope to continue to build and eventually converge our core architecture with EGF2.

Ready to dig into the code? You can find it on GitHub.
