Interesting Data/Infrastructure Projects

So much is happening in the data and infrastructure space that it's not easy to keep track of all the interesting projects. I am compiling a list of such projects on GitHub at https://github.com/dharmeshkakadia/Data-Infra-Projects. The list is currently minimal and fairly obvious, but I plan to add to it over time, including:

  1. Feature and performance comparisons of projects with similar goals.
  2. Better categorization of the projects.
  3. Links to resources for understanding these projects better.

I am sure I have missed many interesting projects, so please help me complete the list.