Lighthouse is an open-source project for building and managing data lakes, built on top of Scala and Apache Spark. With Lighthouse, you can define the data sources available in your data lake and build data pipelines on top of those sources.
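To make the idea of "defining data sources and building pipelines on them" concrete, here is a minimal sketch written against plain Spark. It does not use Lighthouse's actual API: the names `Source`, `Catalog`, `exampleCatalog`, and `revenuePerCustomer` are hypothetical and exist only for illustration, and the in-memory tables stand in for real files or tables in a lake. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.sum

// Hypothetical sketch: none of these names come from Lighthouse itself.
object DataLakeSketch {
  // A data source is a name plus a way to load it as a DataFrame.
  final case class Source(name: String, load: SparkSession => DataFrame)

  // A catalog is the set of sources available in the lake, looked up by name.
  final class Catalog(sources: Seq[Source]) {
    private val byName: Map[String, Source] = sources.map(s => s.name -> s).toMap

    def read(name: String)(implicit spark: SparkSession): DataFrame =
      byName
        .getOrElse(name, throw new NoSuchElementException(s"Unknown source: $name"))
        .load(spark)
  }

  // In-memory stand-ins for files or tables, so the sketch needs no external data.
  def exampleCatalog: Catalog = new Catalog(Seq(
    Source("customers", { spark =>
      import spark.implicits._
      Seq((1, "Alice"), (2, "Bob")).toDF("customer_id", "name")
    }),
    Source("orders", { spark =>
      import spark.implicits._
      Seq((10, 1, 25.0), (11, 1, 5.0), (12, 2, 8.5))
        .toDF("order_id", "customer_id", "amount")
    })
  ))

  // A pipeline is a function from catalog sources to a derived DataFrame.
  def revenuePerCustomer(catalog: Catalog)(implicit spark: SparkSession): DataFrame =
    catalog.read("orders")
      .join(catalog.read("customers"), "customer_id")
      .groupBy("name")
      .agg(sum("amount").as("revenue"))

  def main(args: Array[String]): Unit = {
    implicit val spark: SparkSession =
      SparkSession.builder().master("local[*]").appName("data-lake-sketch").getOrCreate()
    try revenuePerCustomer(exampleCatalog).show()
    finally spark.stop()
  }
}
```

Keeping source definitions in one catalog and expressing pipelines as functions over it means a pipeline never hard-codes where its inputs live, which is the kind of separation the description above points at.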
Resources
You can find our tutorial here