
DP-203 Data Engineering on Microsoft Azure Questions and Answers

Questions 4

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Questions 5

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 5

Options:

Questions 6

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 6

Options:

Questions 7

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 7

Options:

Questions 8

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction

dataset requirements.

What should you create?

Options:

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint
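For reference, surrogate keys in a dedicated SQL pool are commonly generated with an IDENTITY column. A minimal sketch, with illustrative table and column names:

```sql
-- Sketch: a dimension table whose surrogate key is generated on load
-- via the IDENTITY property (names are illustrative).
CREATE TABLE dbo.DimRetailStore
(
    StoreSK   INT IDENTITY(1,1) NOT NULL,  -- surrogate key
    StoreCode NVARCHAR(20)      NOT NULL,  -- business (natural) key
    StoreName NVARCHAR(100)     NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
```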

Questions 9

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A.

change feed

B.

soft delete

C.

time-based retention

D.

lifecycle management

Questions 10

What should you recommend using to secure sensitive customer contact information?

Options:

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Questions 11

What should you do to improve high availability of the real-time data processing solution?

Options:

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Questions 12

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

DP-203 Question 12

Options:
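The usual DDL sequence for exposing external files to a SQL pool follows the pattern below. A hedged sketch, with illustrative names and locations:

```sql
-- Sketch: data source, then file format, then external table
-- (names, location, and columns are illustrative).
CREATE EXTERNAL DATA SOURCE TwitterSource
WITH (LOCATION = 'abfss://data@account.dfs.core.windows.net');

CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL TABLE dbo.TwitterFeed (TweetId BIGINT, Body NVARCHAR(4000))
WITH (LOCATION = '/twitter/',
      DATA_SOURCE = TwitterSource,
      FILE_FORMAT = ParquetFormat);
```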

Questions 13

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 13

Options:

Questions 14

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Questions 15

You have a Microsoft SQL Server database that uses a third normal form schema.

You plan to migrate the data in the database to a star schema in an Azure Synapse Analytics dedicated SQL pool.

You need to design the dimension tables. The solution must optimize read operations.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 15

Options:

Questions 16

You are designing an Azure Stream Analytics job to process incoming events from sensors in retail environments.

You need to process the events to produce a running average of shopper counts during the previous 15 minutes, calculated at five-minute intervals.

Which type of window should you use?

Options:

A.

snapshot

B.

tumbling

C.

hopping

D.

sliding
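A running aggregate over a 15-minute window recomputed every five minutes maps to a hopping window in Stream Analytics. A minimal sketch, assuming an input named SensorInput with ShopperCount and EventTime columns:

```sql
-- Sketch: 15-minute window, recomputed every 5 minutes
-- (input, output, and column names are illustrative).
SELECT System.Timestamp() AS WindowEnd,
       AVG(ShopperCount)  AS AvgShoppers
INTO   Output
FROM   SensorInput TIMESTAMP BY EventTime
GROUP BY HoppingWindow(minute, 15, 5);
```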

Questions 17

You have an Azure Data Lake Storage Gen2 account named account1 that contains a container named Container1. Container1 contains two folders named FolderA and FolderB.

You need to configure access control lists (ACLs) to meet the following requirements:

• Group1 must be able to list and read the contents and subfolders of FolderA.

• Group2 must be able to list and read the contents of FolderA and FolderB.

• Group2 must be prevented from reading any other folders at the root of Container1.

How should you configure the ACL permissions for each group? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

DP-203 Question 17

Options:

Questions 18

You have an Azure subscription that contains the Azure Synapse Analytics workspaces shown in the following table.

DP-203 Question 18

Each workspace must read and write data to datalake1.

Each workspace contains an unused Apache Spark pool.

You plan to configure each Spark pool to share catalog objects that reference datalake1. For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

DP-203 Question 18

Options:

Questions 19

You have an Azure subscription that contains the following resources:

An Azure Active Directory (Azure AD) tenant that contains a security group named Group1

An Azure Synapse Analytics SQL pool named Pool1

You need to control the access of Group1 to specific columns and rows in a table in Pool1.

Which Transact-SQL commands should you use? To answer, select the appropriate options in the answer area.

DP-203 Question 19

Options:
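Column- and row-level restrictions in a SQL pool are typically expressed with a column-scoped GRANT and a security policy. A sketch under assumed names (dbo.Sales, a Region filter, and Group1 as the principal):

```sql
-- Sketch: column-level security via a column-scoped GRANT
-- (table, columns, and filter value are illustrative).
GRANT SELECT ON dbo.Sales (OrderId, Region, Amount) TO Group1;

-- Row-level security via a predicate function and security policy.
CREATE FUNCTION dbo.fn_filter(@Region AS NVARCHAR(50))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN SELECT 1 AS allowed WHERE @Region = N'West';

CREATE SECURITY POLICY SalesFilter
ADD FILTER PREDICATE dbo.fn_filter(Region) ON dbo.Sales;
```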

Questions 20

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Contacts. Contacts contains a column named Phone.

You need to ensure that users in a specific role only see the last four digits of a phone number when querying the Phone column.

What should you include in the solution?

Options:

A.

a default value

B.

dynamic data masking

C.

row-level security (RLS)

D.

column encryption

E.

table partitions
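Exposing only the last four digits of a column is the textbook use of a partial() mask in dynamic data masking. A minimal sketch (the role name is illustrative):

```sql
-- Sketch: mask all but the last four digits of Phone;
-- users without UNMASK see the masked value.
ALTER TABLE dbo.Contacts
ALTER COLUMN Phone ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)');

GRANT UNMASK TO PrivilegedRole;  -- illustrative role name
```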

Questions 21

You are building a data flow in Azure Data Factory that upserts data into a table in an Azure Synapse Analytics dedicated SQL pool.

You need to add a transformation to the data flow. The transformation must specify logic indicating when a row from the input data must be upserted into the sink.

Which type of transformation should you add to the data flow?

Options:

A.

join

B.

select

C.

surrogate key

D.

alter row

Questions 22

You have an Azure data factory named ADM that contains a pipeline named Pipeline1.

Pipeline1 must execute every 30 minutes with a 15-minute offset.

You need to create a trigger for Pipeline1. The trigger must meet the following requirements:

• Backfill data from the beginning of the day to the current time.

• If Pipeline1 fails, ensure that the pipeline can re-execute within the same 30-minute period.

• Ensure that only one concurrent pipeline execution can occur.

• Minimize development and configuration effort.

Which type of trigger should you create?

Options:

A.

schedule

B.

event-based

C.

manual

D.

tumbling window

Questions 23

You are designing an Azure Synapse Analytics dedicated SQL pool.

Groups will have access to sensitive data in the pool as shown in the following table.

DP-203 Question 23

You have policies for the sensitive data. The policies vary by region as shown in the following table.

DP-203 Question 23

You have a table of patients for each region. The tables contain the following potentially sensitive columns.

DP-203 Question 23

You are designing dynamic data masking to maintain compliance.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

DP-203 Question 23

Options:

Questions 24

You have an Azure subscription that contains an Azure data factory.

You are editing an Azure Data Factory activity JSON.

The script needs to copy a file from Azure Blob Storage to multiple destinations. The solution must ensure that the source and destination files have consistent folder paths.

How should you complete the script? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

DP-203 Question 24

Options:

Questions 25

You have an Azure subscription that contains a Microsoft Purview account.

You need to search the Microsoft Purview Data Catalog to identify assets that have an assetType property of Table or View.

Which query should you run?

Options:

A.

assetType IN ('Table', 'View')

B.

assetType:Table OR assetType:View

C.

assetType = (Table or view)

D.

assetType:(Table OR View)

Questions 26

You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:

TransactionType: 40 million rows per transaction type

CustomerSegment: 4 million rows per customer segment

TransactionMonth: 65 million rows per month

AccountType: 500 million rows per account type

You have the following query requirements:

Analysts will most commonly analyze transactions for a given month.

Transaction analysis will typically summarize transactions by transaction type, customer segment, and/or account type.

You need to recommend a partition strategy for the table to minimize query times.

On which column should you recommend partitioning the table?

Options:

A.

CustomerSegment

B.

AccountType

C.

TransactionType

D.

TransactionMonth
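Since queries most commonly filter by month, partitioning on the month column enables partition elimination. A hedged sketch of such a table definition (distribution choice and boundary values are illustrative):

```sql
-- Sketch: fact table partitioned on the column queries filter by most
-- (boundary values and distribution column are illustrative).
CREATE TABLE dbo.FactTransactions
(
    TransactionMonth DATE,
    TransactionType  INT,
    CustomerSegment  INT,
    AccountType      INT,
    Amount           DECIMAL(18,2)
)
WITH
(
    DISTRIBUTION = HASH(AccountType),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (TransactionMonth RANGE RIGHT FOR VALUES
               ('2024-01-01', '2024-02-01', '2024-03-01'))
);
```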

Questions 27

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 10 seconds and a window size of 10 seconds.

Does this meet the goal?

Options:

A.

Yes

B.

No
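Note that a hopping window whose hop size equals its window size behaves like a tumbling window: windows do not overlap, so each event is counted exactly once. A minimal sketch, assuming an input named TwitterInput:

```sql
-- Sketch: non-overlapping 10-second windows; each tweet counted once
-- (input name is illustrative).
SELECT System.Timestamp() AS WindowEnd, COUNT(*) AS TweetCount
FROM   TwitterInput
GROUP BY TumblingWindow(second, 10);
```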

Questions 28

You are incrementally loading data into fact tables in an Azure Synapse Analytics dedicated SQL pool.

Each batch of incoming data is staged before being loaded into the fact tables.

You need to ensure that the incoming data is staged as quickly as possible.

How should you configure the staging tables? To answer, select the appropriate options in the answer area.

DP-203 Question 28

Options:

Questions 29

You have an Azure Data Factory pipeline named pipeline1 that is invoked by a tumbling window trigger named Trigger1. Trigger1 has a recurrence of 60 minutes.

You need to ensure that pipeline1 will execute only if the previous execution completes successfully.

How should you configure the self-dependency for Trigger1?

Options:

A.

offset: "-00:01:00" size: "00:01:00"

B.

offset: "01:00:00" size: "-01:00:00"

C.

offset: "01:00:00" size: "01:00:00"

D.

offset: "-01:00:00" size: "01:00:00"

Questions 30

You plan to create an Azure Synapse Analytics dedicated SQL pool.

You need to minimize the time it takes to identify queries that return confidential information as defined by the company's data privacy regulations and the users who executed the queries.

Which two components should you include in the solution? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

sensitivity-classification labels applied to columns that contain confidential information

B.

resource tags for databases that contain confidential information

C.

audit logs sent to a Log Analytics workspace

D.

dynamic data masking for columns that contain confidential information

Questions 31

You have an Azure subscription that contains an Azure Cosmos DB analytical store and an Azure Synapse Analytics workspace named WS1. WS1 has a serverless SQL pool named Pool1.

You execute the following query by using Pool1.

DP-203 Question 31

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

DP-203 Question 31

Options:

Questions 32

You have an Azure subscription that contains the resources shown in the following table.

DP-203 Question 32

You need to ingest the Parquet files from storage1 to SQL1 by using pipeline1. The solution must meet the following requirements:

• Minimize complexity.

• Ensure that additional columns in the files are processed as strings.

• Ensure that files containing additional columns are processed successfully.

How should you configure pipeline1? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 32

Options:

Questions 33

A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hub to ingest data and an Azure Stream Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU).

You need to optimize performance for the Azure Stream Analytics job.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Implement event ordering.

B.

Implement Azure Stream Analytics user-defined functions (UDF).

C.

Implement query parallelization by partitioning the data output.

D.

Scale the SU count for the job up.

E.

Scale the SU count for the job down.

F.

Implement query parallelization by partitioning the data input.

Questions 34

You have an Azure data factory that connects to a Microsoft Purview account. The data factory is registered in Microsoft Purview.

You update a Data Factory pipeline.

You need to ensure that the updated lineage is available in Microsoft Purview.

You have an Azure subscription that contains an Azure SQL database named DB1 and a storage account named storage1. The storage1 account contains a file named File1.txt. File1.txt contains the names of selected tables in DB1.

You need to use an Azure Synapse pipeline to copy data from the selected tables in DB1 to the files in storage1. The solution must meet the following requirements:

• The Copy activity in the pipeline must be parameterized to use the data in File1.txt to identify the source and destination of the copy.

• Copy activities must occur in parallel as often as possible.

Which two pipeline activities should you include in the pipeline? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Options:

A.

If Condition

B.

ForEach

C.

Lookup

D.

Get Metadata

Questions 35

You have an Azure Data Factory pipeline shown in the following exhibit.

DP-203 Question 35

The execution log for the first pipeline run is shown in the following exhibit.

DP-203 Question 35

The execution log for the second pipeline run is shown in the following exhibit.

DP-203 Question 35

For each of the following statements, select Yes if the statement is true. Otherwise, select No. NOTE: Each correct selection is worth one point.

DP-203 Question 35

Options:

Questions 36

You are monitoring an Azure Stream Analytics job by using metrics in Azure.

You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?

Options:

A.

Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs.

B.

There are errors in the input data.

C.

The late arrival policy causes events to be dropped.

D.

The job lacks the resources to process the volume of incoming data.

Questions 37

You have an Azure Synapse Analytics job that uses Scala.

You need to view the status of the job.

What should you do?

Options:

A.

From Azure Monitor, run a Kusto query against the AzureDiagnostics table.

B.

From Azure Monitor, run a Kusto query against the SparkLoggingEvent_CL table.

C.

From Synapse Studio, select the workspace. From Monitor, select Apache Spark applications.

D.

From Synapse Studio, select the workspace. From Monitor, select SQL requests.

Questions 38

You have an Azure data factory that has the Git repository settings shown in the following exhibit.

DP-203 Question 38

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.

NOTE: Each correct answer is worth one point.

DP-203 Question 38

Options:

Questions 39

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.

You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.

You plan to insert data from the files into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.

You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.

Solution: You use a dedicated SQL pool to create an external table that has an additional DateTime column.

Does this meet the goal?

Options:

A.

Yes

B.

No

Questions 40

You have an activity in an Azure Data Factory pipeline. The activity calls a stored procedure in a data warehouse in Azure Synapse Analytics and runs daily.

You need to verify the duration of the activity when it ran last.

What should you use?

Options:

A.

activity runs in Azure Monitor

B.

Activity log in Azure Synapse Analytics

C.

the sys.dm_pdw_wait_stats data management view in Azure Synapse Analytics

D.

an Azure Resource Manager template

Questions 41

You have an Azure subscription that is linked to a hybrid Azure Active Directory (Azure AD) tenant. The subscription contains an Azure Synapse Analytics SQL pool named Pool1.

You need to recommend an authentication solution for Pool1. The solution must support multi-factor authentication (MFA) and database-level authentication.

Which authentication solution or solutions should you include in the recommendation? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 41

Options:

Questions 42

You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:

* The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.

* Line total sales amount and line total tax amount will be aggregated in Databricks.

* Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.

You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.

What should you recommend?

Options:

A.

Append

B.

Update

C.

Complete

Questions 43

You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID.

You monitor the Stream Analytics job and discover high latency.

You need to reduce the latency.

Which two actions should you perform? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Add a pass-through query.

B.

Add a temporal analytic function.

C.

Scale out the query by using PARTITION BY.

D.

Convert the query to a reference query.

E.

Increase the number of streaming units.

Questions 44

You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account.

You need to output the count of tweets from the last five minutes every minute.

Which windowing function should you use?

Options:

A.

Sliding

B.

Session

C.

Tumbling

D.

Hopping
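A count over the last five minutes, emitted every minute, is another case of overlapping windows. A minimal sketch, assuming an input named TwitterStream and an output named BlobOutput:

```sql
-- Sketch: 5-minute window emitted every 1 minute
-- (input and output names are illustrative).
SELECT System.Timestamp() AS WindowEnd, COUNT(*) AS TweetCount
INTO   BlobOutput
FROM   TwitterStream
GROUP BY HoppingWindow(minute, 5, 1);
```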

Questions 45

You have an Azure Data Lake Storage account named account1.

You use an Azure Synapse Analytics serverless SQL pool to access sales data stored in account1.

You need to create a bar chart that displays sales by product. The solution must minimize development effort.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order

DP-203 Question 45

Options:

Questions 46

You build an Azure Data Factory pipeline to move data from an Azure Data Lake Storage Gen2 container to a database in an Azure Synapse Analytics dedicated SQL pool.

Data in the container is stored in the following folder structure.

/in/{YYYY}/{MM}/{DD}/{HH}/{mm}

The earliest folder is /in/2021/01/01/00/00. The latest folder is /in/2021/01/15/01/45.

You need to configure a pipeline trigger to meet the following requirements:

Existing data must be loaded.

Data must be loaded every 30 minutes.

Late-arriving data of up to two minutes must be included in the load for the time at which the data should have arrived.

How should you configure the pipeline trigger? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 46

Options:

Questions 47

You have an Azure Blob storage account named storage1 and an Azure Synapse Analytics serverless SQL pool named Pool1. From Pool1, you plan to run ad-hoc queries that target storage1.

You need to ensure that you can use shared access signature (SAS) authorization without defining a data source. What should you create first?

Options:

A.

a stored access policy

B.

a server-level credential

C.

a managed identity

D.

a database scoped credential
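For context, SAS-based access without a data source definition typically relies on a database scoped credential. A hedged sketch (the password and SAS token are placeholders, left unfilled):

```sql
-- Sketch: a database scoped credential holding a SAS token
-- (password and token values are placeholders).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL SasCred
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET   = '<sas-token>';
```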

Questions 48

You are designing a dimension table for a data warehouse. The table will track the value of the dimension attributes over time and preserve the history of the data by adding new rows as the data changes.

Which type of slowly changing dimension (SCD) should you use?

Options:

A.

Type 0

B.

Type 1

C.

Type 2

D.

Type 3
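Preserving history by adding new rows is the hallmark of a Type 2 dimension: the current row is closed out and a new version is inserted. A minimal sketch, assuming illustrative column names and that the variables are supplied by the load process:

```sql
-- Sketch: Type 2 change handling (column names are illustrative;
-- @CustomerKey and @NewCity come from the load process).
UPDATE dbo.DimCustomer
SET    EndDate = SYSUTCDATETIME(), IsCurrent = 0
WHERE  CustomerKey = @CustomerKey AND IsCurrent = 1;

INSERT INTO dbo.DimCustomer (CustomerKey, City, StartDate, EndDate, IsCurrent)
VALUES (@CustomerKey, @NewCity, SYSUTCDATETIME(), NULL, 1);
```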

Questions 49

You plan to create an Azure Data Factory pipeline that will include a mapping data flow.

You have JSON data containing objects that have nested arrays.

You need to transform the JSON-formatted data into a tabular dataset. The dataset must have one row for each item in the arrays.

Which transformation method should you use in the mapping data flow?

Options:

A.

unpivot

B.

flatten

C.

new branch

D.

alter row

Questions 50

You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.

You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.

What should you include in the solution?

Options:

A.

Partition by DateTime fields.

B.

Sink to Azure Queue storage.

C.

Include a watermark column.

D.

Use a JSON format for physical data storage.

Exam Code: DP-203
Exam Name: Data Engineering on Microsoft Azure
Last Update: Feb 17, 2025
Questions: 355
