- Log in to the Azure portal at https://portal.azure.com ➢ enter Data Factories in the search box ➢ click Data Factories ➢ click the + Create button ➢ select the subscription ➢ select or create a new resource group ➢ select the region ➢ name the data factory ➢ and then click the Next: Git configuration button.
- Select the Configure Git Later check box ➢ click the Next: Networking button ➢ check the Enable Managed Virtual Network check box ➢ leave the Public Endpoint radio button selected ➢ click the Next: Advanced button ➢ review the options, leaving the default settings ➢ click the Review + Create button ➢ review the selections ➢ and then click Create.
- Once the data factory is provisioned, navigate to it, and then, on the Overview blade, click the Open link in the Open Azure Data Factory Studio tile.
The provisioning of an Azure data factory is straightforward. The one item you have not seen before is on the Advanced tab: the Enable Encryption Using a Customer Managed Key check box. As you might have read in the text on that tab, the data stored in Azure Data Factory is encrypted by default using a Microsoft managed encryption key. If you want additional control over the encryption of the blobs and files stored in Azure Data Factory, you can supply your own managed key. The key must be stored in Azure Key Vault in order to be used. Selecting the check box renders a text box for providing the Azure Key Vault endpoint and a drop-down list for selecting the managed identity used to access the key stored in the identified vault.
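The portal selections made in the exercise, including the optional customer managed key, can also be expressed as a resource definition. The following Python sketch builds a dictionary shaped like an ARM template resource for a data factory. It is only an illustration under stated assumptions: the function name, the factory name, the region, the vault URL, the key name, and the identity resource ID are hypothetical placeholders, and the property names (`vaultBaseUrl`, `keyName`, `identity.userAssignedIdentity`) follow the `Microsoft.DataFactory/factories` resource schema as best understood here, so verify them against the current schema before use.

```python
import json
from typing import Optional


def build_factory_resource(
    name: str,
    location: str,
    vault_base_url: Optional[str] = None,
    key_name: Optional[str] = None,
    user_assigned_identity_id: Optional[str] = None,
) -> dict:
    """Return an ARM-style resource dictionary for a data factory.

    Hypothetical helper for illustration; it only builds a dictionary
    and does not call Azure.
    """
    resource = {
        "type": "Microsoft.DataFactory/factories",
        "apiVersion": "2018-06-01",
        "name": name,
        "location": location,
        "identity": {"type": "SystemAssigned"},
        "properties": {},
    }

    # Without a vault and key, the service default applies:
    # encryption with a Microsoft managed key.
    if vault_base_url and key_name:
        encryption = {"vaultBaseUrl": vault_base_url, "keyName": key_name}
        if user_assigned_identity_id:
            # Assumption: the identity used to read the key is a
            # user-assigned managed identity attached to the factory.
            resource["identity"] = {
                "type": "UserAssigned",
                "userAssignedIdentities": {user_assigned_identity_id: {}},
            }
            encryption["identity"] = {
                "userAssignedIdentity": user_assigned_identity_id
            }
        resource["properties"]["encryption"] = encryption

    return resource


# Placeholder values only; none of these resources exist.
example = build_factory_resource(
    name="example-data-factory",
    location="westeurope",
    vault_base_url="https://example-vault.vault.azure.net",
    key_name="example-key",
    user_assigned_identity_id="<user-assigned-identity-resource-id>",
)
print(json.dumps(example, indent=2))
```

Leaving out the vault URL and key name corresponds to leaving the check box cleared on the Advanced tab, in which case no encryption block is produced.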
When you access Azure Data Factory Studio, you might notice how similar its look and feel is to that of Azure Synapse Analytics Studio. Azure Synapse Studio is the recommended place to perform ingestion from now on, but because Azure Data Factory was the predecessor to Azure Synapse, customers have provisioned Azure Data Factory and are dependent on it; therefore, the product remains. New features will be added to Azure Synapse Analytics, and the capabilities that exist in Azure Data Factory will be migrated to Azure Synapse Analytics, until the point where there is likely no visible difference between the two.
You will find three hubs in Azure Data Factory Studio: Manage, Author, and Monitor. The first two are covered here, and Monitor is covered in Chapter 9. Before heading into the hubs, notice that the Home page has a tile named Ingest. When you click it, you will see the same Copy Data tool that is shown in Figure 3.60. The Orchestrate tile opens the pipeline capability you saw in Figure 3.56, and the Transform tile navigates you to the Data Flows page. Data flows are covered in detail later in this chapter. Many of the features found in Azure Data Factory were already covered in the previous section, so you will find only summaries of the duplicates. Features that are not in Azure Synapse Analytics are discussed in a bit more detail.
Manage
The Manage hub is the place for creating linked services, IRs, triggers, credentials, and managed endpoints, and for configuring Git. This is the place to visit to configure the dependencies Azure Data Factory requires to ingest, copy, and transform data.
Connections
This section contains the interface to configure linked services, IRs, and Azure Purview.
LINKED SERVICES
A linked service is a configuration that contains the parameters necessary to make a connection to a data source. You name the linked service, choose the IR that supplies the compute power for the actions that require it, and provide the connection information. Perform Exercise 3.11 to create a linked service in Azure Data Factory.
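To make the three parts of a linked service concrete (its name, the IR it connects through, and the connection information), the following Python sketch assembles a dictionary in the general shape of a linked service JSON definition. Treat it as an illustration only: the helper name, the linked service name, and the connection string are hypothetical placeholders, and `AutoResolveIntegrationRuntime` is assumed here to be the name of the default Azure IR.

```python
import json
from typing import Optional


def build_linked_service(
    name: str,
    service_type: str,
    type_properties: dict,
    integration_runtime: Optional[str] = None,
) -> dict:
    """Return a dictionary shaped like a linked service definition.

    Hypothetical helper for illustration; it does not call Azure.
    """
    definition = {
        "name": name,
        "properties": {
            "type": service_type,
            # Connection information specific to the data source type.
            "typeProperties": dict(type_properties),
        },
    }
    if integration_runtime is not None:
        # The IR that provides the compute for this connection.
        definition["properties"]["connectVia"] = {
            "referenceName": integration_runtime,
            "type": "IntegrationRuntimeReference",
        }
    return definition


# Placeholder values only; supply a real connection string in practice.
linked_service = build_linked_service(
    name="ExampleBlobStorageLinkedService",
    service_type="AzureBlobStorage",
    type_properties={"connectionString": "<storage-connection-string>"},
    integration_runtime="AutoResolveIntegrationRuntime",
)
print(json.dumps(linked_service, indent=2))
```

In the portal, the same three pieces of information are entered on the New Linked Service form rather than written by hand.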