Hi everyone, I'm back after a long time. Recently I received a request to build a new data warehouse system at Zitga (a game studio). The system needs to meet several criteria:
– Crawl data from multiple sources (BigQuery, AppsFlyer, ironSource, App Store, Play Store, server-to-server, …).
– Replace the current Google BigQuery usage.
– Handle a data size of about 5 TB to 40 TB.
– Provide tools to support data analysts (querying, building machine learning models, dashboards).
– Keep the data collection history.
– Be open and ready for integration with other systems.
Zitga data warehouse architecture
HDFS : distributed storage system; data is spread across multiple nodes.
Hive : reads, writes, and manages large datasets residing in distributed storage using SQL.
Spark : Spark SQL and Spark ML to build machine learning models over Hive tables.
Hue : open source SQL assistant for databases and data warehouses.
Tableau : BI system, analytics platform.
Crawl manager : manages and schedules data collection from multiple sources.
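To make the crawl manager's role more concrete, here is a minimal Python sketch of interval-based scheduling across several sources. It is not the actual Zitga implementation: the `CrawlJob` and `CrawlManager` names, the `fetch` callables, and the intervals are all hypothetical, and a real system would more likely rely on an existing workflow scheduler.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(order=True)
class CrawlJob:
    # Jobs are ordered only by their next run time (hypothetical structure).
    next_run: float
    name: str = field(compare=False)
    interval_s: float = field(compare=False)
    fetch: Callable[[], int] = field(compare=False)  # returns rows collected


class CrawlManager:
    """Keeps crawl jobs in a min-heap keyed by next run time."""

    def __init__(self) -> None:
        self._queue: List[CrawlJob] = []
        # (timestamp, source name, rows collected) for each finished run
        self.history: List[Tuple[float, str, int]] = []

    def register(self, name: str, interval_s: float,
                 fetch: Callable[[], int], first_run: float = 0.0) -> None:
        if interval_s <= 0:
            raise ValueError("interval_s must be positive")
        heapq.heappush(self._queue, CrawlJob(first_run, name, interval_s, fetch))

    def run_due(self, now: float) -> List[str]:
        """Run every job whose next_run <= now, then reschedule it."""
        ran = []
        while self._queue and self._queue[0].next_run <= now:
            job = heapq.heappop(self._queue)
            rows = job.fetch()
            self.history.append((now, job.name, rows))
            job.next_run = now + job.interval_s
            heapq.heappush(self._queue, job)
            ran.append(job.name)
        return ran


if __name__ == "__main__":
    manager = CrawlManager()
    # Placeholder fetchers; real ones would call each source's API.
    manager.register("appsflyer", 3600, lambda: 10)
    manager.register("playstore", 86400, lambda: 5)
    print(manager.run_due(now=0))
    print(manager.history)
```

The `history` list is a stand-in for the "data collection history" requirement: each run records when it happened, which source it hit, and how many rows came back.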
Sizing of cluster
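As a rough illustration of how the 5 TB to 40 TB requirement translates into raw disk, here is a back-of-the-envelope Python sketch. It assumes HDFS's default replication factor of 3; the 25% free-space headroom and the 16 TB of usable disk per node are illustrative assumptions, not figures from the actual design, and compression is ignored.

```python
import math


def raw_capacity_tb(logical_tb: float, replication: int = 3,
                    headroom: float = 0.25) -> float:
    """Raw disk needed so that replicated data fills at most (1 - headroom) of it."""
    return logical_tb * replication / (1 - headroom)


def nodes_needed(logical_tb: float, disk_per_node_tb: float,
                 replication: int = 3, headroom: float = 0.25) -> int:
    """Number of data nodes, rounded up, for a given usable disk size per node."""
    raw = raw_capacity_tb(logical_tb, replication, headroom)
    return math.ceil(raw / disk_per_node_tb)


if __name__ == "__main__":
    for logical in (5, 40):  # the two ends of the stated size range
        raw = raw_capacity_tb(logical)
        nodes = nodes_needed(logical, disk_per_node_tb=16)  # assumed node size
        print(f"{logical} TB logical -> {raw:.0f} TB raw, {nodes} data nodes")
```

Under these assumptions, the raw capacity is four times the logical data size, so node count is driven mainly by the upper end of the range.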
********************************
Design resource : https://drive.google.com/file/d/1JCgx1AT6podIU3Ra5cZGkDdDwNs8a1je/view?fbclid=IwAR1ZeyiaSzn-FCZJpJDcOj11bxcDsRN2296mT0tc8gnQM2EN-CRZyPKiMVs
Such informative material you shared. Thank you!
😃❤️
Hi bro Hieu, I'm curious about your Crawl manager & Import Engine. May I know what technologies you used to build them? Thanks.