HOW TO BUILD AN ETL PIPELINE USING POWER APPS DATAFLOW
Just like other data integration technologies (e.g., Azure Data Factory, SSIS, etc.), Dataflow, a cloud-based Software-as-a-Service feature of Microsoft Power Apps, can help you migrate millions of records from multiple sources to multiple destinations. With dataflows in Power Apps, you can easily ingest, cleanse, transform, integrate, enrich, and schematize business data from various sources and load it into Microsoft Dataverse. For instance, your SQL database records can be migrated to Dataverse and used as a data source in canvas apps, model-driven apps, Power BI visualizations, and lots more.
Like those technologies, Power Apps dataflows use the Extract, Transform, and Load (ETL) methodology to set up data integration pipelines.
Power Apps Dataflow Architecture
Basic Steps to Using Dataflow in Power Apps:
1. In Power Apps, dataflows are created and managed within an environment just like other Power Apps resources. There are several data sources you can connect a dataflow pipeline to in Power Apps, such as SQL Server, Excel workbooks, SharePoint lists, and OData feeds.
After selecting your data source, you are prompted to create a connection to that source, including the account to use when connecting to it.
Once you have connected to the data source, you can select the data for your table. After this selection, the Power Platform Dataflow service will periodically reconnect to the data source to keep the data in your dataflow refreshed, at the frequency you select later in the setup process. This is the Extract stage (a minimal Power Query sketch of this step appears after step 3).
2. You can now transform your data using the Power Query engine. For instance, you can add or modify queries, manage queries by grouping them or adding descriptions to query steps, and add or modify columns. This is the Transform stage (see the second sketch after step 3).
3. In the Load stage, you load the transformed data into the target storage. Mapping is part of loading the data: you either map a source column to a destination column based on the uniqueness of the data and its constraints, or create a new table in the destination based on the transformed data.
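To make the Extract stage in step 1 concrete, here is a minimal Power Query (M) sketch of the kind of query a dataflow builds when you connect to a SQL Server source. The server, database, and table names (contoso-sql.database.windows.net, SalesDb, dbo.Customers) are hypothetical placeholders; substitute your own.

```
let
    // Connect to a SQL Server source (hypothetical server and database names)
    Source = Sql.Database("contoso-sql.database.windows.net", "SalesDb"),

    // Navigate to the table whose rows will be ingested by the dataflow
    Customers = Source{[Schema = "dbo", Item = "Customers"]}[Data]
in
    Customers
```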
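For the Transform stage in step 2, the sketch below extends that query with a few typical Power Query operations: filtering rows, adding a derived column, and setting column types so they line up with the destination Dataverse columns. The column names (IsActive, FirstName, LastName, CustomerId) are likewise assumed for illustration.

```
let
    Source = Sql.Database("contoso-sql.database.windows.net", "SalesDb"),
    Customers = Source{[Schema = "dbo", Item = "Customers"]}[Data],

    // Cleanse: keep only active customers (assumes an IsActive column)
    ActiveOnly = Table.SelectRows(Customers, each [IsActive] = true),

    // Enrich: derive a FullName column from FirstName and LastName
    WithFullName = Table.AddColumn(ActiveOnly, "FullName",
        each [FirstName] & " " & [LastName], type text),

    // Schematize: set column types to match the destination Dataverse columns
    Typed = Table.TransformColumnTypes(WithFullName,
        {{"CustomerId", Int64.Type}, {"FullName", type text}})
in
    Typed
```

Each step in the `let` expression corresponds to a query step you would see listed in the dataflow's Power Query editor.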
Once the mapping is defined, you can schedule the refresh frequency for each of your connected data sources. Dataflows use a data refresh process to keep data up to date. Keep the following in mind when setting up the refresh:
a. In the Power Platform Dataflow authoring tool, you can choose to refresh your dataflow manually or automatically on a scheduled interval of your choice. To schedule a refresh automatically, select Refresh automatically.
b. Enter the dataflow refresh frequency, start date, and time (in UTC).
c. Select Create.
After the refresh completes, the data is loaded into the destination table. An example of dataflow data in Dataverse is shown below:
Sounds interesting, yeah? 😊😊
To learn more about Power Apps dataflows, you can also refer to the Microsoft documentation below:
https://learn.microsoft.com/en-us/power-apps/maker/data-platform/create-and-use-dataflows