Winter Sale - Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: dpm65

DP-203 Data Engineering on Microsoft Azure Questions and Answers

Questions 4

You are implementing Azure Stream Analytics windowing functions.

Which windowing function should you use for each requirement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 4

Options:

Buy Now
Questions 5

You develop data engineering solutions for a company.

A project requires the deployment of data to Azure Data Lake Storage.

You need to implement role-based access control (RBAC) so that project members can manage the Azure Data Lake Storage resources.

Which three actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Assign Azure AD security groups to Azure Data Lake Storage.

B.

Configure end-user authentication for the Azure Data Lake Storage account.

C.

Configure service-to-service authentication for the Azure Data Lake Storage account.

D.

Create security groups in Azure Active Directory (Azure AD) and add project members.

E.

Configure access control lists (ACL) for the Azure Data Lake Storage account.

Buy Now
Questions 6

A company plans to use Apache Spark analytics to analyze intrusion detection data.

You need to recommend a solution to analyze network and system activity data for malicious activities and policy violations. The solution must minimize administrative efforts.

What should you recommend?

Options:

A.

Azure Data Lake Storage

B.

Azure Databricks

C.

Azure HDInsight

D.

Azure Data Factory

Buy Now
Questions 7

You have an Azure Data Lake Storage account that contains a staging zone.

You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes mapping data Flow, and then inserts the data info the data warehouse.

Does this meet the goal?

Options:

A.

Yes

B.

No

Buy Now
Questions 8

You have an Azure Stream Analytics job named Job1.

The metrics of Job1 from the last hour are shown in the following table.

DP-203 Question 8

The late arrival tolerance for Job1 is set to the five seconds.

You need to optimize Job1.

Which two actions achieve the goal? Each correct answer presents a complete solution.

NOTE: Each correct answer is worth one point.

Options:

A.

Resolution errors in inputs processing.

B.

Parallelize the query

C.

Resolution errors in output processing

D.

Increase the number of SUs.

Buy Now
Questions 9

You are designing a sales transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will contains approximately 60 million rows per month and will be partitioned by month. The table will use a clustered column store index and round-robin distribution.

Approximately how many rows will there be for each combination of distribution and partition?

Options:

A.

1 million

B.

5 million

C.

20 million

D.

60 million

Buy Now
Questions 10

You are monitoring an Azure Stream Analytics job by using metrics in Azure.

You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?

Options:

A.

Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs.

B.

There are errors in the input data.

C.

The late arrival policy causes events to be dropped.

D.

The job lacks the resources to process the volume of incoming data.

Buy Now
Questions 11

You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID.

You monitor the Stream Analytics job and discover high latency.

You need to reduce the latency.

Which two actions should you perform? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Add a pass-through query.

B.

Add a temporal analytic function.

C.

Scale out the query by using PARTITION BY.

D.

Convert the query to a reference query.

E.

Increase the number of streaming units.

Buy Now
Questions 12

You are building an Azure Stream Analytics job to retrieve game data.

You need to ensure that the job returns the highest scoring record for each five-minute time interval of each game.

How should you complete the Stream Analytics query? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 12

Options:

Buy Now
Questions 13

You have the following Azure Stream Analytics query.

DP-203 Question 13

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

DP-203 Question 13

Options:

Buy Now
Questions 14

You have an Azure Synapse workspace named MyWorkspace that contains an Apache Spark database named mytestdb.

You run the following command in an Azure Synapse Analytics Spark pool in MyWorkspace.

CREATE TABLE mytestdb.myParquetTable(

EmployeeID int,

EmployeeName string,

EmployeeStartDate date)

USING Parquet

You then use Spark to insert a row into mytestdb.myParquetTable. The row contains the following data.

DP-203 Question 14

One minute later, you execute the following query from a serverless SQL pool in MyWorkspace.

SELECT EmployeeID

FROM mytestdb.dbo.myParquetTable

WHERE name = 'Alice';

What will be returned by the query?

Options:

A.

24

B.

an error

C.

a null value

Buy Now
Questions 15

Note: The question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it As a result these questions will not appear in the review screen. You have an Azure Data Lake Storage account that contains a staging zone.

You need to design a dairy process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes a mapping data low. and then inserts the data into the data warehouse.

Does this meet the goal?

Options:

A.

Yes

B.

No

Buy Now
Questions 16

You plan to create a real-time monitoring app that alerts users when a device travels more than 200 meters away from a designated location.

You need to design an Azure Stream Analytics job to process the data for the planned app. The solution must minimize the amount of code developed and the number of technologies used.

What should you include in the Stream Analytics job? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 16

Options:

Buy Now
Questions 17

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a fact table named Tablet. Table1 contains sales data. Sixty-five million rows of data are added to Table1 monthly.

At the end of each month, you need to remove data that is older than 36 months. The solution must minimize how long it takes to remove the data.

How should you partition Table1, and how should you remove the old data? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 17

Options:

Buy Now
Questions 18

You are developing an Azure Synapse Analytics pipeline that will include a mapping data flow named Dataflow1. Dataflow1 will read customer data from an external source and use a Type 1 slowly changing dimension (SCO) when loading the data into a table named DimCustomer1 in an Azure Synapse Analytics dedicated SQL pool.

You need to ensure that Dataflow1 can perform the following tasks:

* Detect whether the data of a given customer has changed in the DimCustomer table.

• Perform an upsert to the DimCustomer table.

Which type of transformation should you use for each task? To answer, select the appropriate options in the answer area

NOTE; Each correct selection is worth one point.

DP-203 Question 18

Options:

Buy Now
Questions 19

What should you recommend using to secure sensitive customer contact information?

Options:

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Buy Now
Questions 20

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 20

Options:

Buy Now
Questions 21

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Buy Now
Questions 22

What should you do to improve high availability of the real-time data processing solution?

Options:

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Buy Now
Questions 23

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction

dataset requirements.

What should you create?

Options:

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Buy Now
Questions 24

You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 24

Options:

Buy Now
Questions 25

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

DP-203 Question 25

Options:

Buy Now
Questions 26

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transaction-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

DP-203 Question 26

Options:

Buy Now
Questions 27

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 27

Options:

Buy Now
Questions 28

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Buy Now
Questions 29

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area

NOTE: Each correct selection b worth one point.

DP-203 Question 29

Options:

Buy Now
Questions 30

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 30

Options:

Buy Now
Questions 31

You build a data warehouse in an Azure Synapse Analytics dedicated SQL pool.

Analysts write a complex SELECT query that contains multiple JOIN and CASE statements to transform data for use in inventory reports. The inventory reports will use the data and additional WHERE parameters depending on the report. The reports will be produced once daily.

You need to implement a solution to make the dataset available for the reports. The solution must minimize query times.

What should you implement?

Options:

A.

a materialized view

B.

a replicated table

C.

in ordered clustered columnstore index

D.

result set chaching

Buy Now
Questions 32

You are building a data flow in Azure Data Factory that upserts data into a table in an Azure Synapse Analytics dedicated SQL pool.

You need to add a transformation to the data flow. The transformation must specify logic indicating when a row from the input data must be upserted into the sink.

Which type of transformation should you add to the data flow?

Options:

A.

join

B.

select

C.

surrogate key

D.

alter row

Buy Now
Questions 33

You have an Azure data factor/ connected to a Git repository that contains the following branches:

• mam: Collaboration branch

• abc: Feature branch

• xyz: Feature branch

You save charges to a pipeline in the xyz branch.

You need to publish the changes to the live service

What should you do first?

Options:

A.

Push the code to a remote origin.

B.

Publish the data factory.

C.

Create a pull request to merge the changes into the abc branch.

D.

Create a pull request to merge the changes into the main branch.

Buy Now
Questions 34

You have an Azure Synapse Analytics dedicated SQL pool.

You need to monitor the database for long-running queries and identify which queries are waiting on resources

Which dynamic manage ment view should you use for each requirement? To answer, select the appropriate options in the answer area.

NOTE; Each correct answer is worth one point.

DP-203 Question 34

Options:

Buy Now
Questions 35

You have an Apache Spark DataFrame named temperatures. A sample of the data is shown in the following table.

DP-203 Question 35

You need to produce the following table by using a Spark SQL query.

DP-203 Question 35

How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

DP-203 Question 35

Options:

Buy Now
Questions 36

You have an Azure Data Factory pipeline that performs an incremental load of source data to an Azure Data Lake Storage Gen2 account.

Data to be loaded is identified by a column named LastUpdatedDate in the source table.

You plan to execute the pipeline every four hours.

You need to ensure that the pipeline execution meets the following requirements:

    Automatically retries the execution when the pipeline run fails due to concurrency or throttling limits.

    Supports backfilling existing data in the table.

Which type of trigger should you use?

Options:

A.

Storage event

B.

on-demand

C.

schedule

D.

tumbling window

Buy Now
Questions 37

You build an Azure Data Factory pipeline to move data from an Azure Data Lake Storage Gen2 container to a database in an Azure Synapse Analytics dedicated SQL pool.

Data in the container is stored in the following folder structure.

/in/{YYYY}/{MM}/{DD}/{HH}/{mm}

The earliest folder is /in/2021/01/01/00/00. The latest folder is /in/2021/01/15/01/45.

You need to configure a pipeline trigger to meet the following requirements:

    Existing data must be loaded.

    Data must be loaded every 30 minutes.

    Late-arriving data of up to two minutes must he included in the load for the time at which the data should have arrived.

How should you configure the pipeline trigger? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 37

Options:

Buy Now
Questions 38

You have an Azure Synapse Analytics serverless SQL pool named Pool1 and an Azure Data Lake Storage Gen2 account named storage1. The AllowedBlobpublicAccess porperty is disabled for storage1.

You need to create an external data source that can be used by Azure Active Directory (Azure AD) users to access storage1 from Pool1.

What should you create first?

Options:

A.

an external resource pool

B.

a remote service binding

C.

database scoped credentials

D.

an external library

Buy Now
Questions 39

You have an Azure Synapse Analytics dedicated SQL pool.

You need to create a table named FactInternetSales that will be a large fact table in a dimensional model. FactInternetSales will contain 100 million rows and two columns named SalesAmount and OrderQuantity. Queries executed on FactInternetSales will aggregate the values in SalesAmount and OrderQuantity from the last year for a specific product. The solution must minimize the data size and query execution time.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 39

Options:

Buy Now
Questions 40

You have an Azure Synapse Analytics workspace named WS1.

You have an Azure Data Lake Storage Gen2 container that contains JSON-formatted files in the following format.

DP-203 Question 40

You need to use the serverless SQL pool in WS1 to read the files.

How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

DP-203 Question 40

Options:

Buy Now
Questions 41

You have a Microsoft SQL Server database that uses a third normal form schema.

You plan to migrate the data in the database to a star schema in an Azure Synapse Analytics dedicated SQI pool.

You need to design the dimension tables. The solution must optimize read operations.

What should you include in the solution? to answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 41

Options:

Buy Now
Questions 42

You have an Azure Synapse Analytics dedicated SQL pool.

You need to ensure that data in the pool is encrypted at rest. The solution must NOT require modifying applications that query the data.

What should you do?

Options:

A.

Enable encryption at rest for the Azure Data Lake Storage Gen2 account.

B.

Enable Transparent Data Encryption (TDE) for the pool.

C.

Use a customer-managed key to enable double encryption for the Azure Synapse workspace.

D.

Create an Azure key vault in the Azure subscription grant access to the pool.

Buy Now
Questions 43

You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account.

You need to output the count of tweets during the last five minutes every five minutes. Each tweet must only

be counted once.

Which windowing function should you use?

Options:

A.

a five-minute Session window

B.

a five-minute Sliding window

C.

a five-minute Tumbling window

D.

a five-minute Hopping window that has one-minute hop

Buy Now
Questions 44

You are creating an Apache Spark job in Azure Databricks that will ingest JSON-formatted data.

You need to convert a nested JSON string into a DataFrame that will contain multiple rows.

Which Spark SQL function should you use?

Options:

A.

explode

B.

filter

C.

coalesce

D.

extract

Buy Now
Questions 45

You have an Azure Data Factory instance that contains two pipelines named Pipeline1 and Pipeline2.

Pipeline1 has the activities shown in the following exhibit.

DP-203 Question 45

Pipeline2 has the activities shown in the following exhibit.

DP-203 Question 45

You execute Pipeline2, and Stored procedure1 in Pipeline1 fails.

What is the status of the pipeline runs?

Options:

A.

Pipeline1 and Pipeline2 succeeded.

B.

Pipeline1 and Pipeline2 failed.

C.

Pipeline1 succeeded and Pipeline2 failed.

D.

Pipeline1 failed and Pipeline2 succeeded.

Buy Now
Questions 46

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone.

You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You schedule an Azure Databricks job that executes an R notebook, and then inserts the data into the data warehouse.

Does this meet the goal?

Options:

A.

Yes

B.

No

Buy Now
Questions 47

You have an on-premises data warehouse that includes the following fact tables. Both tables have the following columns: DateKey, ProductKey, RegionKey. There are 120 unique product keys and 65 unique region keys.

DP-203 Question 47

Queries that use the data warehouse take a long time to complete.

You plan to migrate the solution to use Azure Synapse Analytics. You need to ensure that the Azure-based solution optimizes query performance and minimizes processing skew.

What should you recommend? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point

DP-203 Question 47

Options:

Buy Now
Questions 48

You have several Azure Data Factory pipelines that contain a mix of the following types of activities.

* Wrangling data flow

* Notebook

* Copy

* jar

Which two Azure services should you use to debug the activities? Each correct answer presents part of the solution NOTE: Each correct selection is worth one point.

Options:

A.

Azure HDInsight

B.

Azure Databricks

C.

Azure Machine Learning

D.

Azure Data Factory

E.

Azure Synapse Analytics

Buy Now
Questions 49

You are designing a statistical analysis solution that will use custom proprietary1 Python functions on near real-time data from Azure Event Hubs.

You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.

What should you recommend?

Options:

A.

Azure Stream Analytics

B.

Azure SQL Database

C.

Azure Databricks

D.

Azure Synapse Analytics

Buy Now
Exam Code: DP-203
Exam Name: Data Engineering on Microsoft Azure
Last Update: Jan 15, 2025
Questions: 351

PDF + Testing Engine

$61.25  $174.99

Testing Engine

$47.25  $134.99
buy now DP-203 testing engine

PDF (Q&A)

$40.25  $114.99
buy now DP-203 pdf
dumpsmate guaranteed to pass
24/7 Customer Support

DumpsMate's team of experts is always available to respond your queries on exam preparation. Get professional answers on any topic of the certification syllabus. Our experts will thoroughly satisfy you.

Site Secure

mcafee secure

TESTED 18 Jan 2025