{"id":8966,"date":"2021-06-10T09:24:31","date_gmt":"2021-06-10T08:24:31","guid":{"rendered":"https:\/\/ee.yelkdev.site\/?p=8966"},"modified":"2024-03-28T13:48:28","modified_gmt":"2024-03-28T13:48:28","slug":"six-essential-practices-of-data-pipelines","status":"publish","type":"post","link":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/","title":{"rendered":"Six essential practices of data pipelines"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">A carefully managed data pipeline can provide you with seamless access to reliable and well-structured datasets. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">A generalised form of transferring data from a source system A to a target system B, data pipelines are developed in small pieces, and integrated with data, logic and algorithms to perform complex transformations. To do this effectively, there are some essential practices that need to be adhered to.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In our <\/span><a href=\"https:\/\/playbooks.equalexperts.com\/data-pipeline\"><span style=\"font-weight: 400;\">data pipeline playbook <\/span><\/a><span style=\"font-weight: 400;\">we have identified eleven practices to follow when creating a data pipeline.\u00a0 Here we touch on six of these practices such as how to start by using a steel thread, and in our next blog post we will talk about iteratively creating your data models as well as observing the pipeline.\u00a0 Applying these practices will allow you to integrate new data sources faster at a higher quality as outlined in our recent post on the <\/span><a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/what-are-the-benefits-of-data-pipelines\/\"><span style=\"font-weight: 400;\">benefits of a data pipeline<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h2><b>About this series<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">This is part four in our six-part series on 
the data pipeline, taken from our<\/span><a href=\"https:\/\/playbooks.equalexperts.com\/data-pipeline\"><span style=\"font-weight: 400;\"> latest playbook<\/span><\/a><span style=\"font-weight: 400;\">. First we looked at the basics, in <\/span><a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/what-is-a-data-pipeline\/\"><span style=\"font-weight: 400;\">What is a data pipeline<\/span><\/a><span style=\"font-weight: 400;\">. Next we looked at the <\/span><a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/what-are-the-benefits-of-data-pipelines\/\"><span style=\"font-weight: 400;\">six main benefits of an effective data pipeline<\/span><\/a><span style=\"font-weight: 400;\">. In part three we considered the \u2018must have\u2019 <\/span><a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-must-have-key-principles-of-data-pipeline-projects\/\"><span style=\"font-weight: 400;\">key principles of data pipeline projects<\/span><\/a><span style=\"font-weight: 400;\">. Now we look at the six key practices needed for a data pipeline. Before we get into the details we just want to cover off what\u2019s coming in the rest of the series. In <a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/five-more-practices-that-will-ensure-a-successful-data-pipeline-project\/\">part five<\/a> we look at more of those practices, and in <a href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/common-pitfalls-of-data-pipeline-projects-and-how-to-avoid-them\/\">part six<\/a> we look at the many pitfalls you can encounter in a data pipeline project.\u00a0<\/span><\/p>\n<h2><b>The growing need for good data engineering<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Today, data engineers serve a wider audience than just a few years ago. As there is a growing need for organisations to apply machine learning techniques to their data, new challenges are faced by data engineers in order to remain relevant. 
Essential to every project is the ability to reliably deliver large-volume data sets so that data scientists can train more accurate models.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Aside from dealing with larger data volumes, these pipelines need to be flexible in order to accommodate the variety of data and the increasingly high processing velocity required. The following practices are those that we feel are essential to successful projects, the minimum requirement for success. They are based on our collective knowledge and experience gained across <\/span><span style=\"font-weight: 400;\">many data pipeline engagements.\u00a0\u00a0<\/span><\/p>\n<h2><b>Practice 1: Build for the right latency<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">When designing the pipeline, it\u2019s important to consider what level of latency you need. What is your speed of decision? How quickly do you need the data? Building and running a low latency, real-time data pipeline will be significantly more expensive, so make sure that you know you need one before embarking on that path. You should also ask how fast your pipeline can be. Is it even possible for you to have a real-time data pipeline? If all your data sources are produced by daily batch jobs, then the best latency you can reach will be daily updates, and the extra cost of real-time implementations will not provide any business benefits.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you do need to be within real-time or near real-time, then this needs to be a key factor at each step of the pipeline. The speed of the pipe is conditioned by the speed of the slowest stage.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">And be careful not to confuse the need for a real-time decision engine with the need for a real-time historical data store, such as a data warehouse for the data scientists. Decision models are created from stores of historical data and need to be validated before deployment into production. 
Model release usually takes place at a slower cadence (e.g., weekly or monthly). Of course, the deployed model will need to work on a live data stream, but we consider this part of the application development. This is not the appropriate use for a data warehouse or similar.<\/span><\/p>\n<h2><b>Practice 2: Keep raw data<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Ingestions should start by storing raw data in the pipeline without making any changes. In most environments, data storage is cheap, and it is common to have all the ingested data persisted and unchanged. Typically, this is done via cloud file storage (S3, GCP Cloud Storage, Azure Storage), or HDFS for on-premise data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Keeping this data allows you to reprocess it without re-ingestion if any business rule changes, and it also retains the possibility of new pipelines based on this data if, for example, a new dashboard is needed.<\/span><\/p>\n<h2><b>Practice 3: Break transformations into small tasks<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Pipelines are usually composed of several transformations of the data, activities such as format validation, conformance against master data, enrichment, imputation of missing values, etc. Data pipelines are no different from other software and should thus follow modern software development practices of breaking down software units into small reproducible tasks. Each task should target a single output and be deterministic and idempotent. If we run a transformation on the same data multiple times, the results should always be the same.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">By creating easily tested tasks, we increase the quality and confidence in the pipeline, as well as enhance the pipeline maintainability. 
If we need to add or change something on the transformation, we have the guarantee that if we rerun it, the only changes will be the ones we made.<\/span><\/p>\n<h2><b>Practice 4: Support backfilling<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Pipelines are immature at the start of development, so it may not be possible to fully evaluate whether the pipeline is working correctly or not. Is this metric unusual because this is what always happens on Mondays, or is it a fault in the pipeline? We may well find at a later date that some of the ingested data was incorrect. Imagine you find out that during a month, a source was reporting incorrect results, but for the rest of the time, the data was correct.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We should engineer our pipelines so that we can correct them as our understanding of the dataflows matures. We should be able to backfill the stored data when we have identified a problem in the source or at some point in the pipeline, and ideally, it should be possible to backfill just for the corresponding period of time, leaving the data for other periods untouched.<\/span><\/p>\n<h2><b>Practice 5: Start with a steel thread<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">When starting at a greenfield site, we typically build up data pipelines iteratively around a steel thread \u2013 first a thin data pipe which is a thin slice through the architecture. This progressively validates the quality and security of the data. The first thread creates an initial point of value &#8211; probably a single data source, with some limited processing, stored where it can be accessed by at least one data user. The purpose of this first thread is to provide an initial path to data and uncover unexpected blockers, so it is selected for simplicity rather than having the highest end-user value. 
Bear in mind that in the first iteration, you will need to:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Create a cloud environment which meets the organisation&#8217;s information security needs.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Set up the continuous development environment.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Create an appropriate test framework.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Model the data and create the first schemas in a structured data store.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Coach end users on how to access the data.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Implement simple monitoring of the pipeline.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Later iterations will bring in more data sources and provide access to wider groups of users, as well as bringing in more complex functionality such as:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Including sources of reference or master data.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Advanced monitoring and alerting.<\/span><\/li>\n<\/ul>\n<h2><b>Practice 6: Utilise cloud \u2013 define your pipelines with infrastructure-as-code<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Pipelines are a mixture of infrastructure (e.g., hosting services, databases, etc.), processing code, and scripting\/configuration. They can be implemented using proprietary and\/or open-source technologies. However, all of the cloud providers have excellent cloud native services for defining, operating and monitoring data pipelines. 
They are usually superior in terms of their ability to scale with increasing volumes, simpler to configure and operate, and support a more agile approach to data architecture.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Whichever solution is adopted, since pipelines are a mixture of components, it is critical to adopt an infrastructure-as-code approach. Only by having the pipeline defined and built using tools, such as terraform, and source controlled in a repository, will pipeline owners have control over the pipeline and the confidence to rebuild and refine it as needed.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Hopefully this gives a clearer overview of some of the essential practices needed to create an effective data pipeline. In the next blog post in this series, we will outline more of the practices needed for data pipelines.\u00a0 Until then, for more information on data pipelines in general, take a look at our <\/span><a href=\"https:\/\/playbooks.equalexperts.com\/data-pipeline\"><span style=\"font-weight: 400;\">Data Pipeline Playbook<\/span><span style=\"font-weight: 400;\">.<\/span><\/a><span style=\"font-weight: 400;\">\u00a0\u00a0<\/span><\/p>\n<h2><b>Contact us!<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">If you\u2019d like us to share our experience of data pipelines with you, get in touch using the form below.<\/span><\/p>\n\n\t\t\t\t\t\t<script>\n\t\t\t\t\t\t\twindow.hsFormsOnReady = window.hsFormsOnReady || [];\n\t\t\t\t\t\t\twindow.hsFormsOnReady.push(()=>{\n\t\t\t\t\t\t\t\thbspt.forms.create({\n\t\t\t\t\t\t\t\t\tportalId: 7208712,\n\t\t\t\t\t\t\t\t\tformId: \"83acdf22-cf43-47ba-b91f-0428264b824a\",\n\t\t\t\t\t\t\t\t\ttarget: \"#hbspt-form-1758976743000-6806697277\",\n\t\t\t\t\t\t\t\t\tregion: \"eu1\",\n\t\t\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t})});\n\t\t\t\t\t\t<\/script>\n\t\t\t\t\t\t<div class=\"hbspt-form\" id=\"hbspt-form-1758976743000-6806697277\"><\/div>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here 
we introduce six of the practices you should follow when creating data pipelines.  From how to use raw data to starting with a steel thread, applying these practices will allow you to integrate new data sources faster at a higher quality.<\/p>\n","protected":false},"author":164,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","inline_featured_image":false,"footnotes":""},"categories":[5],"tags":[188,187,186,192],"location":[397],"class_list":["post-8966","post","type-post","status-publish","format-standard","hentry","category-our-thinking","tag-data-engineering","tag-data-management","tag-data-pipelines","tag-data-pipelines-playbook"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.9 (Yoast SEO v25.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Six essential practices of data pipelines | Equal Experts<\/title>\n<meta name=\"description\" content=\"Six practices to follow when creating data pipelines. From support backfilling to utilising cloud, these allow faster integration and higher quality data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Six essential practices of data pipelines\" \/>\n<meta property=\"og:description\" content=\"Here we introduce six of the practices you should follow when creating data pipelines. 
From how to use raw data to starting with a steel thread, applying these practices will allow you to integrate new data sources faster at a higher quality.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\" \/>\n<meta property=\"og:site_name\" content=\"Equal Experts\" \/>\n<meta property=\"article:published_time\" content=\"2021-06-10T08:24:31+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-03-28T13:48:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2021\/04\/datapipeline_blog4_fb.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Cl\u00e1udio Diniz\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:title\" content=\"Six essential practices of data pipelines\" \/>\n<meta name=\"twitter:description\" content=\"Here we introduce six of the practices you should follow when creating data pipelines. 
From how to use raw data to starting with a steel thread, applying these practices will allow you to integrate new data sources faster at a higher quality.\" \/>\n<meta name=\"twitter:creator\" content=\"@EqualExperts\" \/>\n<meta name=\"twitter:site\" content=\"@EqualExperts\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Cl\u00e1udio Diniz\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\"},\"author\":{\"name\":\"Cl\u00e1udio Diniz\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/28ff89d676b184c93ab62bc91b0af11e\"},\"headline\":\"Six essential practices of data pipelines\",\"datePublished\":\"2021-06-10T08:24:31+00:00\",\"dateModified\":\"2024-03-28T13:48:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\"},\"wordCount\":1485,\"publisher\":{\"@id\":\"https:\/\/www.equalexperts.com\/#organization\"},\"keywords\":[\"data engineering\",\"data management\",\"data pipelines\",\"data pipelines playbook\"],\"articleSection\":[\"Our Thinking\"],\"inLanguage\":\"en-GB\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\",\"url\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\",\"name\":\"Six essential practices of data pipelines | Equal 
Experts\",\"isPartOf\":{\"@id\":\"https:\/\/www.equalexperts.com\/#website\"},\"datePublished\":\"2021-06-10T08:24:31+00:00\",\"dateModified\":\"2024-03-28T13:48:28+00:00\",\"description\":\"Six practices to follow when creating data pipelines. From support backfilling to utilising cloud, these allow faster integration and higher quality data.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.equalexperts.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Six essential practices of data pipelines\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.equalexperts.com\/#website\",\"url\":\"https:\/\/www.equalexperts.com\/\",\"name\":\"Equal Experts\",\"description\":\"Making Software. 
Better.\",\"publisher\":{\"@id\":\"https:\/\/www.equalexperts.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.equalexperts.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.equalexperts.com\/#organization\",\"name\":\"Equal Experts\",\"url\":\"https:\/\/www.equalexperts.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg\",\"contentUrl\":\"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg\",\"width\":719,\"height\":340,\"caption\":\"Equal Experts\"},\"image\":{\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/EqualExperts\",\"https:\/\/www.linkedin.com\/company\/equal-experts\/?viewAsMember=true\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/28ff89d676b184c93ab62bc91b0af11e\",\"name\":\"Cl\u00e1udio Diniz\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/www.equalexperts.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d70fbe38b0540d312610b719e2e75bc9f302aafe3264bf1eb8174eb191c4879d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d70fbe38b0540d312610b719e2e75bc9f302aafe3264bf1eb8174eb191c4879d?s=96&d=mm&r=g\",\"caption\":\"Cl\u00e1udio Diniz\"}}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Six essential practices of data pipelines | Equal Experts","description":"Six practices to follow when creating data pipelines. 
From support backfilling to utilising cloud, these allow faster integration and higher quality data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/","og_locale":"en_GB","og_type":"article","og_title":"Six essential practices of data pipelines","og_description":"Here we introduce six of the practices you should follow when creating data pipelines. From how to use raw data to starting with a steel thread, applying these practices will allow you to integrate new data sources faster at a higher quality.","og_url":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/","og_site_name":"Equal Experts","article_published_time":"2021-06-10T08:24:31+00:00","article_modified_time":"2024-03-28T13:48:28+00:00","og_image":[{"width":1200,"height":630,"url":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2021\/04\/datapipeline_blog4_fb.png","type":"image\/png"}],"author":"Cl\u00e1udio Diniz","twitter_card":"summary_large_image","twitter_title":"Six essential practices of data pipelines","twitter_description":"Here we introduce six of the practices you should follow when creating data pipelines. 
From how to use raw data to starting with a steel thread, applying these practices will allow you to integrate new data sources faster at a higher quality.","twitter_creator":"@EqualExperts","twitter_site":"@EqualExperts","twitter_misc":{"Written by":"Cl\u00e1udio Diniz","Estimated reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#article","isPartOf":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/"},"author":{"name":"Cl\u00e1udio Diniz","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/28ff89d676b184c93ab62bc91b0af11e"},"headline":"Six essential practices of data pipelines","datePublished":"2021-06-10T08:24:31+00:00","dateModified":"2024-03-28T13:48:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/"},"wordCount":1485,"publisher":{"@id":"https:\/\/www.equalexperts.com\/#organization"},"keywords":["data engineering","data management","data pipelines","data pipelines playbook"],"articleSection":["Our Thinking"],"inLanguage":"en-GB"},{"@type":"WebPage","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/","url":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/","name":"Six essential practices of data pipelines | Equal Experts","isPartOf":{"@id":"https:\/\/www.equalexperts.com\/#website"},"datePublished":"2021-06-10T08:24:31+00:00","dateModified":"2024-03-28T13:48:28+00:00","description":"Six practices to follow when creating data pipelines. 
From support backfilling to utilising cloud, these allow faster integration and higher quality data.","breadcrumb":{"@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.equalexperts.com\/blog\/our-thinking\/six-essential-practices-of-data-pipelines\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.equalexperts.com\/"},{"@type":"ListItem","position":2,"name":"Six essential practices of data pipelines"}]},{"@type":"WebSite","@id":"https:\/\/www.equalexperts.com\/#website","url":"https:\/\/www.equalexperts.com\/","name":"Equal Experts","description":"Making Software. Better.","publisher":{"@id":"https:\/\/www.equalexperts.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.equalexperts.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Organization","@id":"https:\/\/www.equalexperts.com\/#organization","name":"Equal Experts","url":"https:\/\/www.equalexperts.com\/","logo":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg","contentUrl":"https:\/\/www.equalexperts.com\/wp-content\/uploads\/2018\/08\/Equal_Experts_Logo_CMYK_Colour.jpg","width":719,"height":340,"caption":"Equal 
Experts"},"image":{"@id":"https:\/\/www.equalexperts.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/EqualExperts","https:\/\/www.linkedin.com\/company\/equal-experts\/?viewAsMember=true"]},{"@type":"Person","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/28ff89d676b184c93ab62bc91b0af11e","name":"Cl\u00e1udio Diniz","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/www.equalexperts.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/d70fbe38b0540d312610b719e2e75bc9f302aafe3264bf1eb8174eb191c4879d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d70fbe38b0540d312610b719e2e75bc9f302aafe3264bf1eb8174eb191c4879d?s=96&d=mm&r=g","caption":"Cl\u00e1udio Diniz"}}]}},"_links":{"self":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts\/8966","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/users\/164"}],"replies":[{"embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/comments?post=8966"}],"version-history":[{"count":0,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/posts\/8966\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/media?parent=8966"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/categories?post=8966"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/tags?post=8966"},{"taxonomy":"location","embeddable":true,"href":"https:\/\/www.equalexperts.com\/wp-json\/wp\/v2\/location?post=8966"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}