# PREP 501: Ingesting JSON Test Data into Adobe Experience Platform

In this tutorial, we will learn how to ingest test data, especially nested JSON data, into the Platform. You will need this in order to do your Data Distiller modules.

## Prerequisites

You need to set up DBVisualizer:

{% content-ref url="unit-1-getting-started/prep-400-dbvisualizer-sql-editor-setup-for-data-distiller" %}
[prep-400-dbvisualizer-sql-editor-setup-for-data-distiller](https://data-distilller.gitbook.io/adobe-data-distiller-guide/unit-1-getting-started/prep-400-dbvisualizer-sql-editor-setup-for-data-distiller)
{% endcontent-ref %}

You will need to download this JSON file. Extract the zip and copy the JSON file over:

{% file src="<https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FC9PVHWEfWx4fZONb7cGT%2FLuma-web-data.json.zip?alt=media&token=f00987d7-1589-4897-ae71-1faa8358943b>" %}

## **Scenario**

We are going to ingest Luma data into our test environment. Luma is a [fictitious online store](https://luma.enablementadobe.com/content/luma/us/en.html) created by Adobe.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F6truaKqY41U0A2QO8pze%2FScreen%20Shot%202023-08-28%20at%208.50.53%20AM.png?alt=media&#x26;token=7ab53270-4e4a-4dab-856a-119b6edd1f35" alt=""><figcaption><p>Luma website</p></figcaption></figure>

The fastest way to understand what is happening on the website is to check the Products tab. There are three categories of products for different (and all) personas. You can browse them, authenticate yourself, and add items to a cart. The data that we are ingesting into the Platform is the test website traffic data that conforms to the Adobe Analytics schema.

Unlike the Movie Genre Targeting example, where we simply dropped a CSV file and the data popped out as a dataset, we cannot do the same with JSON files: we need to specify the nested schema so that the system can understand the structure of the data.
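To see why, consider the shape of a single event in the file. A hypothetical record resembling this web traffic data (the field names here are illustrative, not taken from the actual file) might look like:

```json
{
  "timestamp": "2023-08-28T16:50:00Z",
  "web": {
    "webPageDetails": {
      "name": "Home Page",
      "URL": "https://luma.enablementadobe.com/"
    }
  },
  "commerce": {
    "productListAdds": { "value": 1 }
  }
}
```

A flat CSV row has no way to express the objects nested under `web` and `commerce`, which is why a schema must be declared up front.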

## Setup Azure Storage Explorer

1. We will be using an interesting technique to ingest this data, one that will also form the basis of simulating batch ingestion. Download Azure Storage Explorer from this [link](https://azure.microsoft.com/en-us/blog/microsoft-azure-data-lake-storage-adls-in-storage-explorer-public-preview/). Make sure you download the right version for your OS and install it.
2. We will be using Azure Storage Explorer as a local file browser to upload files into AEP's Data Landing Zone: Azure-based blob storage that sits outside AEP's governance boundary. Data in the Landing Zone has a 7-day TTL (time to live), and the zone gives teams a way to push data asynchronously into this staging area prior to ingestion. It is also a fantastic tool for testing the ingestion of test data into AEP.
3. In the Azure Storage Explorer, open up the Connect Dialog by clicking the plug icon and then click on the ADLSGen2 container or directory option:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FpOt69YdL13RmKqYBj2RW%2FScreen%20Shot%202023-06-02%20at%2011.28.19%20AM.png?alt=media&#x26;token=5dbd7ae2-77b6-42f9-9c9a-1011f957ff04" alt=""><figcaption></figcaption></figure>

5. Choose **Shared SAS URL** as the connection type. Note that if multiple users have access to the Landing Zone URL, they can all write over each other; if you are seeking isolation, it is only available at the sandbox level, as there is one Landing Zone per sandbox.

   <figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F6j5zYBvZFq7hytuzMDGk%2FScreen%20Shot%202023-06-02%20at%2011.29.36%20AM.png?alt=media&#x26;token=91ecca56-c702-4da4-a772-8d3e47bdfe33" alt=""><figcaption></figcaption></figure>
6. Name the container and then add the credentials by going into **Adobe Experience Platform->Sources->Data Landing Zone**.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F5GoK8SOvnikgU9IjEbbK%2FScreen%20Shot%202023-08-28%20at%209.00.27%20AM.png?alt=media&#x26;token=fa41bc09-2141-48a1-8453-bcd5f2fa9dd4" alt=""><figcaption></figcaption></figure>

7. Now go into **Adobe Experience Platform UI->Sources->Catalog->Cloud Storage->Data Landing Zone** and **View Credentials**:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FqL3l5GJzbGZyGnqf0DdK%2FScreen%20Shot%202023-08-28%20at%207.30.10%20AM.png?alt=media&#x26;token=095a23c5-f597-4c1c-a615-1fa1417747ab" alt=""><figcaption></figcaption></figure>

8. If you click on **View Credentials**, you should see this screen. Click to copy the **SAS URI**:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FMGyM8Xm2l0Oj5xxOldMX%2FScreen%20Shot%202023-08-28%20at%209.04.56%20AM.png?alt=media&#x26;token=6ff1da3c-3d21-4d86-8976-2c442106245f" alt=""><figcaption></figcaption></figure>

9. Copy the **SAS URI** into the Storage Explorer Account setup:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FFHeXNuGBYyiSeJWVNuSO%2FScreen%20Shot%202023-08-28%20at%209.10.14%20AM.png?alt=media&#x26;token=2f1df2ef-eb71-4f07-b919-2027b76d0a02" alt=""><figcaption></figcaption></figure>

10. Click next to complete the setup:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F2tBnNPpPPoRNRYjgAyxq%2FScreen%20Shot%202023-08-28%20at%209.13.46%20AM.png?alt=media&#x26;token=d88bfd2f-0a53-4a36-b2e1-4865b1f96038" alt=""><figcaption></figcaption></figure>

11. The screen will look like the following. Either **drag and drop** the JSON file or **Upload**:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F5hBePMEOHLxK9i4jS1Jn%2FScreen%20Shot%202023-08-28%20at%209.16.39%20AM.png?alt=media&#x26;token=14da5aed-395d-4615-a09f-43eca5b91f9e" alt=""><figcaption></figcaption></figure>

12. Navigate to **Adobe Experience Platform UI->Sources->Catalog->Cloud Storage->Data Landing Zone**. You will see either an **Add Data** or a **Setup** button on the card itself. Click it to access the Data Landing Zone.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FhaAbRTx2qQatFLdh3B1H%2FScreen%20Shot%202023-08-28%20at%209.20.02%20AM.png?alt=media&#x26;token=a6d664e1-72d3-4548-bbaa-e746f0d4f426" alt=""><figcaption></figcaption></figure>

13. Voila! You should now see the JSON file you uploaded, and you will be able to preview the first 8 to 10 records (the top of the file) as well. These records will be used later to validate our ingestion pipeline.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FEaN6o0Di053m34kT33AR%2FScreen%20Shot%202023-08-28%20at%208.13.47%20AM.png?alt=media&#x26;token=6de052ec-2fa1-460d-97a7-094c5a8225c6" alt=""><figcaption></figcaption></figure>

## Create an XDM Schema

1. Create an XDM schema by going to **Adobe Experience Platform UI->Schemas->Create XDM Experience Event**

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FJMIaK4oXf2UaNyifgqzh%2FScreen%20Shot%202023-08-28%20at%209.31.16%20AM.png?alt=media&#x26;token=8a77a37e-8897-4303-b4a8-730654cb3588" alt=""><figcaption></figcaption></figure>

2. On the Schema screen, click on the pane for **Field groups->Add**

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FLGAypt0kWsecc0MJwOMd%2FScreen%20Shot%202023-08-28%20at%209.36.02%20AM.png?alt=media&#x26;token=1fb71895-7522-4254-9b3c-227bf97af1e2" alt=""><figcaption></figcaption></figure>

3. Search for "Adobe Analytics" as a term for Field Groups:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F1kFXLFoITwf9Z6oogegx%2FScreen%20Shot%202023-08-28%20at%209.32.17%20AM.png?alt=media&#x26;token=746701ce-30c8-404e-bcbf-db5674bbb83f" alt=""><figcaption></figcaption></figure>

4. Add the **Adobe Analytics ExperienceEvent Template** field group. This is a comprehensive field group, but we will be using only a portion of its fields.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FUObVS7WjyhEOIvhlEkTU%2FScreen%20Shot%202023-08-28%20at%209.32.28%20AM.png?alt=media&#x26;token=591b6d43-af34-428b-a008-a2d8f9d7b7c5" alt=""><figcaption></figcaption></figure>

5. Save the schema as **Luma Web Data**.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FOoY9dPAb3VMo2o0iWICm%2FScreen%20Shot%202023-08-28%20at%209.33.16%20AM.png?alt=media&#x26;token=0f76e656-65cb-40af-bc0d-8e3fecf0d8c1" alt=""><figcaption></figcaption></figure>

## Ingest Data from Data Landing Zone

1. With the file selected, click the **XDM Compliant** dropdown and change it to **Yes**:

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FbxCQwpEbm6BlkxRiH4fk%2FScreen%20Shot%202023-08-28%20at%209.27.10%20AM.png?alt=media&#x26;token=41fda65e-0d81-4cf1-8f69-eba489ec86f2" alt=""><figcaption></figcaption></figure>

2. Go to the next screen and fill out the details exactly as shown in the screen below. Name the dataset **luma\_web\_data**, choose the **Luma Web Data** schema, and enable **Partial Ingestion**.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2Fjh2UGXsY7d4Ein4sqDSq%2FScreen%20Shot%202023-08-28%20at%2011.42.02%20AM.png?alt=media&#x26;token=46aa632d-0dd8-49d7-8d8c-38c6f6ef7f4b" alt=""><figcaption></figcaption></figure>

3. Configure **Scheduling** to **Minute**, with a frequency of every **15 minutes**.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FXvsGfKpw01ygLOIZFE4S%2FScreen%20Shot%202023-08-28%20at%208.14.35%20AM.png?alt=media&#x26;token=272f5724-1585-4263-8caa-93c666367967" alt=""><figcaption></figcaption></figure>

4. Click **Next** and **Finish**. Your dataflow should execute, and you should see the dataset **luma\_web\_data** in **Adobe Experience Platform UI->Datasets**. Click on the dataset **luma\_web\_data**; you should see about 733K records ingested.

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FWxgwMUpmJRWMAtdnXnBH%2FScreen%20Shot%202023-08-28%20at%2011.47.26%20AM.png?alt=media&#x26;token=1c1b2fe0-37f6-462f-a1b0-5eadf04baab1" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
**Note:** By marking the dataset as XDM compliant in the dataflow step, we avoided having to go through a mapping process. We could do so because the Adobe Analytics schema we chose is a superset of the Luma schema, so there was no point in doing a manual mapping. If you are bringing in Adobe Analytics data in practice, you may not be this lucky: you will need to account for eVars and do the mapping yourself. That is beyond the scope of this guide.
{% endhint %}

## Query the Data

1. The first query that you can type is:

```sql
select * from luma_web_data;
```

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2F2kV7S66Dwm3UNfHs2mz5%2FScreen%20Shot%202023-08-28%20at%2011.56.10%20AM.png?alt=media&#x26;token=b211d617-8ee1-4f06-9e5b-2cf730032036" alt=""><figcaption><p>luma_web_data 50,000 results. </p></figcaption></figure>

2. To retrieve up to 50,000 results, you need to configure DBVisualizer:

{% content-ref url="unit-1-getting-started/prep-400-dbvisualizer-sql-editor-setup-for-data-distiller" %}
[prep-400-dbvisualizer-sql-editor-setup-for-data-distiller](https://data-distilller.gitbook.io/adobe-data-distiller-guide/unit-1-getting-started/prep-400-dbvisualizer-sql-editor-setup-for-data-distiller)
{% endcontent-ref %}
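Alternatively, you can bound the result set in the query itself, and a quick aggregate lets you confirm that the ingested volume matches the roughly 733K records shown on the dataset screen:

```sql
-- Return only the first 10 rows instead of the full result set
select * from luma_web_data limit 10;

-- Sanity-check the total number of ingested records (expect ~733K)
select count(1) as record_count from luma_web_data;
```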

3. If you need to query a complex object, say the **web** object, use the `to_json` construct:

```sql
select to_json(web) from luma_web_data;
```

<figure><img src="https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FMsSfE3YDdVUQwJFkBEC1%2FScreen%20Shot%202023-08-28%20at%2011.58.48%20AM.png?alt=media&#x26;token=6dba509d-75c2-4a43-ab81-ed7268865a3f" alt=""><figcaption></figcaption></figure>
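Beyond serializing the whole object with `to_json`, you can also drill into individual nested fields with dot notation. A sketch, assuming the `web` object follows the Adobe Analytics ExperienceEvent field group, which nests `webPageDetails` (with its `name` field) under `web`:

```sql
-- Count events per page name by drilling into the nested web object
select web.webPageDetails.name as page_name,
       count(1) as views
from luma_web_data
group by web.webPageDetails.name
order by views desc
limit 10;
```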
