Roles and responsibilities
As a Data Architect you are expected to have solid data management experience (data modelling, data quality, ETL, and reporting) and to use that experience to support day-to-day data science activities, as well as to drive initiatives that improve how the data domain works by raising data quality and standardizing modelling and ETL designs.
The role requires strong partnership with data scientists, data engineers, data analysts, and reporting teams, built on strong interpersonal and communication skills as well as project and stakeholder management.
What’s On Your Plate?
As a Data Architect, you will be responsible for two scopes of work:
- Data Units Support
- Act as a data management consultant for our data science teams:
- Guide them to create data models and ETL pipelines that best serve their use cases, following best practices for performance, quality, and cost.
- Review production deployment requests and advise changes needed to meet the set standards.
- Support the team in tuning their data solutions.
- Align priorities and plan weekly/monthly activities.
- Horizontal Initiatives
- Identify gaps and shortfalls and build end-to-end data solutions that fill them, in terms of:
- Reducing data platform activities cost.
- Enhancing data pipelines performance.
- Build and maintain data products that support the data quality and governance program.
- Build and enhance data quality gate solutions (a sketch of such a gate follows this list).
- Identify data quality and governance shortfalls, perform root cause analysis, and recommend solutions and monitoring mechanisms.
- On a rotational basis within the team, act as the production support guardian for all ETL pipelines on the data platform during your turn.
- Partner with our data engineering and data reporting teams to better support the data units and to align on horizontal initiatives across the data platform.
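As a concrete illustration of the quality-gate item above, here is a minimal, hypothetical sketch of a rule-based gate that fails a pipeline run when checks are violated. The rule names, field names, and sample batch are all invented for illustration; this is not a prescribed implementation.

```python
# Minimal, hypothetical data quality gate: each rule inspects a batch of
# records and the gate fails the run if any rule reports violations.
from typing import Callable

Record = dict
Rule = Callable[[list[Record]], list[str]]

def no_null_keys(records: list[Record]) -> list[str]:
    """Flag records whose primary key is missing."""
    return [f"row {i}: null order_id" for i, r in enumerate(records)
            if r.get("order_id") is None]

def non_negative_amounts(records: list[Record]) -> list[str]:
    """Flag records with negative monetary amounts."""
    return [f"row {i}: negative amount" for i, r in enumerate(records)
            if r.get("amount", 0) < 0]

def run_gate(records: list[Record], rules: list[Rule]) -> bool:
    """Return True if the batch passes all rules; log violations otherwise."""
    violations = [msg for rule in rules for msg in rule(records)]
    for msg in violations:
        print("QUALITY GATE:", msg)
    return not violations

batch = [{"order_id": 1, "amount": 25.0}, {"order_id": None, "amount": -3.0}]
if not run_gate(batch, [no_null_keys, non_negative_amounts]):
    raise SystemExit("Quality gate failed: aborting load step")
```

In practice the gate would sit between the transform and load steps, so bad batches never reach downstream consumers.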
What Did We Order?
- Advanced SQL skills and experience working with relational databases, plus working familiarity with a variety of other database technologies (a small example of the expected SQL fluency follows this list).
- 5-8 years of solid data modelling experience.
- Experience building and optimizing data pipelines, data warehouse architectures, and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes that support data transformation, data structures, metadata, dependency management, and workload management.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- A good problem solver with a ‘figure it out’ growth mindset.
- An excellent collaborator.
- An excellent communicator.
- A strong sense of ownership and accountability.
- A ‘keep it simple’ approach to #makeithappen.
- Bachelor's degree in engineering, computer science, technology, or similar fields.
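To make the SQL expectation above concrete, here is a small, self-contained example of the kind of fluency intended: window functions computing per-group ranks and running totals. SQLite (which supports window functions from version 3.25) stands in for whatever engine a team actually uses, and the schema and data are made up.

```python
# Illustrative only: an "advanced SQL" window-function query run against
# an in-memory SQLite database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1,'a',10),(2,'a',40),(3,'b',25),(4,'b',5);
""")

# Rank each customer's orders by amount and compute a running total.
query = """
SELECT customer, order_id, amount,
       RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk,
       SUM(amount) OVER (PARTITION BY customer ORDER BY amount DESC) AS running
FROM orders;
"""
for row in conn.execute(query):
    print(row)
```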
Desired candidate profile
1. Data Architecture and Design
- Expertise in designing and structuring complex data systems that support business needs, ensuring they are scalable, flexible, and efficient.
- Understanding of data models, schemas, and data structures like relational, NoSQL, and cloud-based systems.
- Experience with designing databases (OLTP, OLAP), data lakes, and data warehouses to handle structured and unstructured data.
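By way of illustration of OLAP-oriented physical design, a minimal star schema might look like the sketch below. All table and column names are hypothetical, and SQLite again stands in for the warehouse engine.

```python
# Hypothetical star schema for an OLAP workload: one fact table keyed to
# two dimensions.
import sqlite3

ddl = """
CREATE TABLE dim_date (
    date_key   INTEGER PRIMARY KEY,   -- e.g. 20240131
    full_date  TEXT NOT NULL,
    month      INTEGER NOT NULL
);
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    segment      TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
    customer_key INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    quantity     INTEGER NOT NULL,
    revenue      REAL NOT NULL
);
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
print("star schema created")
```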
2. Database Technologies
- Strong knowledge of relational databases (e.g., MySQL, PostgreSQL, Oracle), NoSQL databases (e.g., MongoDB, Cassandra), and graph databases.
- Experience with cloud-based database technologies like Amazon RDS, Google BigQuery, Azure SQL Database, and Snowflake.
- Proficiency in database management, including performance tuning, indexing, backup, and disaster recovery planning.
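As a toy illustration of the tuning workflow this item covers, the sketch below inspects a query plan, adds an index, and re-checks the plan. SQLite's EXPLAIN QUERY PLAN stands in for the equivalent facility in other engines, and the table is invented.

```python
# Toy performance-tuning loop: inspect the plan, add an index, re-check.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, ts TEXT, payload TEXT)")

def show_plan(sql: str) -> None:
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(row)

lookup = "SELECT * FROM events WHERE user_id = 42"
show_plan(lookup)                      # full table scan before indexing
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")
show_plan(lookup)                      # index search after indexing
```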
3. Data Modeling and ETL Processes
- Expertise in designing logical and physical data models that are optimized for querying and reporting.
- Experience in building and managing ETL (Extract, Transform, Load) pipelines to move data between systems.
- Familiarity with data integration tools like Apache NiFi, Talend, and Informatica.
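Here is a compressed sketch of the extract-transform-load flow referred to above, using only standard-library modules; a production pipeline would add orchestration, retries, and schema validation. The source data and column names are illustrative.

```python
# Minimal ETL: extract rows from a CSV, transform (clean + derive a field),
# load into a warehouse table.
import csv, io, sqlite3

raw = io.StringIO("order_id,amount\n1,10.5\n2,\n3,7.25\n")  # stand-in for a source file

# Extract
rows = list(csv.DictReader(raw))

# Transform: drop rows with missing amounts, derive a rounded amount
clean = [
    (int(r["order_id"]), float(r["amount"]), round(float(r["amount"])))
    for r in rows if r["amount"]
]

# Load
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, amount REAL, amount_rounded INTEGER)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", clean)
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0], "rows loaded")
```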
4. Cloud Computing and Distributed Systems
- Proficiency in cloud platforms (e.g., AWS, Google Cloud Platform, Microsoft Azure) for deploying and managing data infrastructure.
- Knowledge of distributed computing frameworks like Apache Hadoop, Spark, and Flink for processing large volumes of data.
- Experience designing cloud-based data storage systems that allow for scalability, cost-effectiveness, and high availability.
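As one hedged example of cloud storage design, the sketch below lands a transformed batch in object storage under a date-partitioned key layout, which keeps scans cheap and enables lifecycle rules for cost control. It assumes AWS and the boto3 SDK purely for illustration (GCP and Azure offer analogous clients); the bucket and key are invented.

```python
# Hypothetical: landing a transformed batch into cloud object storage.
import json
import boto3

s3 = boto3.client("s3")
batch = [{"order_id": 1, "amount": 10.5}]

# Partitioned key layout (illustrative bucket/prefix) supports cheap
# partition pruning and per-prefix lifecycle rules.
s3.put_object(
    Bucket="example-data-lake",
    Key="orders/dt=2024-01-31/part-0000.json",
    Body="\n".join(json.dumps(r) for r in batch).encode(),
)
```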
5. Big Data Technologies
- Familiarity with big data tools like Apache Kafka, Hadoop, and Spark for managing and processing large datasets in real time.
- Understanding of batch processing and stream processing architectures to ensure data is processed and analyzed in a timely manner.
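To illustrate the batch-versus-stream distinction without committing to any framework, the sketch below computes the same aggregation two ways: once over a complete batch, and once incrementally, one event at a time. In practice Kafka, Spark, or Flink would replace the plain Python iterables.

```python
# Batch vs. stream, stripped to the idea: batch sees all records at once;
# streaming updates state one event at a time.
from typing import Iterable, Iterator

events = [("a", 10), ("b", 5), ("a", 3)]          # (key, value) pairs

def batch_totals(batch: Iterable[tuple[str, int]]) -> dict[str, int]:
    totals: dict[str, int] = {}
    for key, value in batch:
        totals[key] = totals.get(key, 0) + value
    return totals

def stream_totals(stream: Iterable[tuple[str, int]]) -> Iterator[dict[str, int]]:
    totals: dict[str, int] = {}
    for key, value in stream:                      # one event at a time
        totals[key] = totals.get(key, 0) + value
        yield dict(totals)                         # emit updated state

print(batch_totals(events))                        # {'a': 13, 'b': 5}
for snapshot in stream_totals(events):             # incremental snapshots
    print(snapshot)
```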
6. Data Governance and Security
- Knowledge of data governance best practices, ensuring data quality, integrity, and compliance with legal regulations (e.g., GDPR, HIPAA).
- Expertise in implementing data security measures, including encryption, access controls, and audit logs, to protect sensitive data.
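One narrow illustration of such security measures: pseudonymizing a sensitive column with a keyed hash before data leaves a restricted zone, so joins on the column still work without exposing the raw value. The key handling and column names here are placeholders, not a recommended scheme.

```python
# Hypothetical column-level pseudonymization: replace raw emails with a
# keyed HMAC so joins still work but the raw value is not exposed.
import hmac, hashlib

SECRET_KEY = b"rotate-me-via-a-secrets-manager"   # placeholder, not real key handling

def pseudonymize(value: str) -> str:
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

row = {"customer_email": "user@example.com", "amount": 42.0}
safe_row = {**row, "customer_email": pseudonymize(row["customer_email"])}
print(safe_row)
```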