# \[DRAFT]FUNC 100: Date and Time Functions

{% file src="<https://1899859430-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FEhcgqFIfGdE0GXJzi5yR%2Fuploads%2FTpDxiS4y98FbD3CaGbGE%2Fmarketing_dataset_no_spaces.csv?alt=media&token=68dbf237-717b-4b83-92be-ece82705c692>" %}

## `hour` function

The `hour` function is used when you want to extract the hour component from a **`timestamp`** or **`datetime`** column. It's particularly useful for time-based analysis, such as:

1. **Aggregating Data by Hour**: When you need to analyze events or actions (like clicks, sales, or logins) based on the hour of the day. For example, identifying peak activity hours in a campaign.
2. **Time-of-Day Patterns**: When looking for trends or patterns in data based on the time of day. For instance, understanding what hours are most effective for sending marketing emails.
3. **Comparing Hourly Performance**: When comparing the performance of different hours within a day across multiple campaigns, as shown in the query.

{% code overflow="wrap" %}

```sql
SELECT campaign_id, hour(click_timestamp) AS hour_of_day, COUNT(*) AS total_clicks
FROM campaign_clicks
GROUP BY campaign_id, hour_of_day;
```

{% endcode %}

## `date_trunc` function

The **`date_trunc`** function is used when you want to aggregate data by a specific time interval, such as day, week, month, or year. In the provided query, **`date_trunc('month', transaction_date)`** is used to round the **`transaction_date`** down to the first day of the month, allowing you to analyze data at the monthly level. Here are some use cases for using the **`date_trunc`** function:

1. **Aggregating by Time Intervals**: When you need to summarize data over consistent time periods, such as months, quarters, or years. This is useful for time series analysis, trend detection, or reporting.
2. **Monthly or Periodic Reporting**: When generating monthly reports to summarize key metrics (e.g., total revenue, number of transactions) for each month.
3. **Smoothing Time-Series Data**: When you want to eliminate daily fluctuations by summarizing data into larger time buckets, such as weeks or months, to better understand long-term trends.
4. **Comparing Performance Across Periods**: When comparing metrics across different time intervals, like comparing revenue month-over-month.

&#x20;The syntax for the **`date_trunc`** function is as follows:

```sql
date_trunc('unit', date)
```

* **`unit`**: This specifies the level of truncation and can be values like `'year'`, `'quarter'`, `'month'`, `'week'`, `'day'`, `'hour'`, `'minute'`, or `'second'`.
* **`date`**: The date or timestamp expression that you want to truncate.

{% code overflow="wrap" %}

```sql
SELECT date_trunc('month', transaction_date) AS month, SUM(revenue) AS total_revenue
FROM transactions
GROUP BY date_trunc('month', transaction_date)
ORDER BY month;
```

{% endcode %}

## `year` function

The **`year`** function in this query extracts the year from the **`signup_date`** field, allowing you to group and analyze data on an annual basis. Here are some situations where using the `year` function is beneficial:

1. **Yearly Aggregation**: Useful for grouping data by year to summarize activities or events that occurred within each year. In the example below, it counts the number of customer signups per year.
2. **Cohort Analysis**: Helps in tracking groups of customers who signed up in the same year, providing insights into customer behavior, growth trends, or retention over time.
3. **Year-over-Year Comparisons**: Facilitates comparisons across different years, such as assessing revenue growth, user acquisition, or other key metrics.
4. **Trend Analysis**: Useful for identifying patterns or trends over multiple years, such as determining which years had peak or low signup activity.

{% code overflow="wrap" %}

```sql
SELECT year(signup_date) AS signup_year, COUNT(customer_id) AS cohort_size
FROM customers
GROUP BY year(signup_date)
ORDER BY signup_year;
```

{% endcode %}

## `dayofweek` function

The **`dayofweek`** function is useful for:

1. **Grouping Data by Day of the Week**: It allows you to analyze trends or patterns based on the day, such as identifying which days have higher sales or more website traffic.
2. **Classifying Days as Weekend or Weekday**: As shown in the example, you can use `dayofweek` to categorize days into "Weekend" or "weekday" for analysis.
3. **Scheduling and Planning**: When analyzing tasks or events based on the day of the week, this function helps in scheduling resources more efficiently.

{% code overflow="wrap" %}

```sql
SELECT CASE
         WHEN dayofweek(transaction_date) IN (1, 7) THEN 'Weekend'
         ELSE 'Weekday'
       END AS day_type,
       SUM(revenue) AS total_revenue
FROM transactions
GROUP BY day_type;
```

{% endcode %}

## `datediff` function

The **`datediff`** function is used to calculate the difference between two dates, typically returning the result as the number of days between them. In the context of the provided query, **`datediff`** is being used to determine the number of days between consecutive purchase dates for each customer.

{% code overflow="wrap" %}

```sql
SELECT customer_id, avg(datediff(purchase_date, lag(purchase_date) OVER (PARTITION BY customer_id ORDER BY purchase_date))) AS avg_days_between_purchases
FROM purchases;
```

{% endcode %}

Here's a breakdown of the query above and the use of **`datediff`**:

1. **Calculating Differences Between Consecutive Dates**: The **`datediff`** function computes the difference in days between a **`purchase_date`** and the previous **`purchase_date`** for the same customer, as determined by the **`lag`** function.
2. **Using `lag` Function**: The **`lag(purchase_date)`** function retrieves the previous purchase date for each **`customer_id`**, allowing you to compare it with the current **`purchase_date`**.
3. **Grouping by Customer**: The **`PARTITION BY customer_id`** clause ensures that the calculations are performed separately for each customer, allowing you to analyze individual purchasing patterns.
4. **Averaging the Day Differences**: The **`avg`** function calculates the average number of days between purchases for each customer, providing insight into their purchase frequency.

## **`current_date` function**

&#x20;Here’s a breakdown of the usage:

1. **Filtering Data for Today's Date**: The query retrieves all customers who signed up on the current date by comparing the **`signup_date`** to **`current_date()`**. This helps identify new signups that occurred today.
2. **Use Cases for `current_date()`**:
   * **Daily Reports**: Generating reports that focus on today's activities, such as new signups, sales, or customer interactions.
   * **Real-Time Monitoring**: Tracking metrics that need to be updated continuously, like daily active users or same-day transactions.
   * **Scheduled Queries**: Running automated tasks or queries that process data based on the current date.

The **`current_date()`** function is used to get the current date (today's date) in SQL. In the given query, it is used to filter records where the **`signup_date`** matches today's date.

```sql
SELECT customer_id, signup_date
FROM customer_activity_data
WHERE signup_date = current_date();
```

## **`current_timestamp` function**

&#x20;Here’s a breakdown of its use:

1. **Capturing the Exact Interaction Time**: By using `current_timestamp()`, you record the precise moment when the interaction took place. This is useful for time-sensitive data tracking, such as logging user actions or events.
2. **Use Cases for `current_timestamp()`**:
   * **Event Logging**: Recording the exact time of events, such as user interactions, system events, or changes in status.
   * **Audit Trails**: Keeping a detailed log of activities for compliance, debugging, or tracking user behavior over time.
   * **Real-Time Analytics**: Analyzing data based on the exact time of occurrence, which is helpful for real-time dashboards or time-series analysis.

The **`current_timestamp()`** function is used below to get the current date and time (timestamp) at the moment the query is executed. In the given **`INSERT`** statement, it adds a record to the `campaign_interactions` table with the exact time when the insertion occurs.

```sql
INSERT INTO campaign_interactions (customer_id, campaign_id, interaction_time)
VALUES (1234, 5678, current_timestamp());
```

## **`current_timezone` function**

Here are the use cases:

* **Tracking Data Entry Timezone**: This could be used to log the timezone in which the data entry occurred, particularly useful in multi-regional systems where data might be inserted from various geographical locations.
* **Localization of Campaign Analytics**: When analyzing campaign interactions, knowing the timezone helps localize data for regional reports. It would enable the conversion of timestamps to the local time of the interaction, giving a more accurate representation of when customers interacted with campaigns.
* **Timezone-Based Personalization**: If the system's timezone reflects the user's local time, you could use this data for personalized marketing. For example, sending notifications at specific times based on each user's local timezone.
* **Debugging and Audit Trails**: In systems where data ingestion and interaction logs come from various regions, capturing the current timezone during data entry could help troubleshoot issues, understand latency, or provide insights into data processing across time zones.
* **Data Synchronization Across Regions**: In distributed systems, knowing the current timezone for data entries could aid in synchronizing data across servers or applications located in different time zones.

```sql
SELECT customer_id, current_timezone() AS customer_timezone
FROM campaign_interactions;
```

## **`date` function**

```sql
SELECT customer_id, date(click_timestamp) AS click_date
FROM customer_activity_data;
```

## **`date_add` function**

```sql
SELECT customer_id, last_interaction_date, date_add(last_interaction_date, 7) AS predicted_next_interaction
FROM customer_activity_data;
```

## **`date_diff` function**

```sql
SELECT customer_id, date_diff(current_date(), last_purchase_date) AS inactivity_days
FROM customer_activity_data;
```

## **`date_format` function**

```sql
SELECT customer_id, date_format(transaction_date, 'MMMM yyyy') AS transaction_month
FROM customer_activity_data;
```

## **`date_from_unix_date` function**

```sql
SELECT customer_id, date_from_unix_date(unix_timestamp) AS readable_date
FROM customer_activity_data;
```

## **`hour` function**

```sql
SELECT customer_id, hour(click_timestamp) AS hour_of_day, COUNT(*) AS total_clicks
FROM customer_activity_data
GROUP BY customer_id, hour_of_day;
```

## **`last_day` function**

```sql
SELECT customer_id, last_day(subscription_start_date) AS subscription_end_date
FROM customer_activity_data;
```

## **`make_date` function**

```sql
SELECT make_date(2024, 12, 25) AS campaign_start_date;
```

## **`month` function**

```sql
SELECT month(transaction_date) AS transaction_month, SUM(revenue) AS total_revenue
FROM customer_activity_data
GROUP BY transaction_month;
```

## **`months_between` function**

{% code overflow="wrap" %}

```sql
SELECT customer_id, months_between(last_purchase_date, signup_date) AS months_between_purchases
FROM customer_activity_data;
```

{% endcode %}

## **`next_day` function**

```sql
SELECT customer_id, next_day(last_interaction_date, 'Monday') AS follow_up_date
FROM customer_activity_data;
```

## **`minute` function**

```sql
SELECT customer_id, minute(click_timestamp) AS minute_of_interaction, COUNT(*) AS total_clicks
FROM customer_activity_data
GROUP BY customer_id, minute_of_interaction;
```

## **`second` function**

```sql
SELECT customer_id, second(click_timestamp) AS second_of_interaction
FROM customer_activity_data;
```

## **`timediff` function**

{% code overflow="wrap" %}

```sql
SELECT customer_id, timediff(last_interaction_date, first_interaction_date) AS time_spent
FROM customer_activity_data;
```

{% endcode %}

## **`timestamp` function**

```sql
SELECT timestamp('2024-12-31 23:59:59') AS campaign_end_timestamp;
```

## **`timestamp_micros` function**

```sql
SELECT timestamp_micros(1696843573000000) AS event_timestamp;
```

## **`timestamp_millis` function**

```sql
SELECT timestamp_millis(1696843573000) AS event_timestamp;
```

## **`timestamp_seconds` function**

```sql
SELECT timestamp_seconds(1696843573) AS event_timestamp;
```

## **`timestampadd` function**

{% code overflow="wrap" %}

```sql
SELECT customer_id, timestampadd(MINUTE, 30, click_timestamp) AS predicted_purchase_time
FROM customer_activity_data;
```

{% endcode %}

## **`timestampdiff` function**

{% code overflow="wrap" %}

```sql
SELECT customer_id, timestampdiff(HOUR, first_interaction_date, last_interaction_date) AS hours_between_interactions
FROM customer_activity_data;
```

{% endcode %}

## **`date_part` function**

```sql
SELECT customer_id, date_part('day', transaction_date) AS purchase_day
FROM customer_activity_data;
```

## **`to_date` function**

```sql
SELECT to_date('2024-12-31', 'yyyy-MM-dd') AS campaign_launch_date;
```

## **`to_timestamp` function**

{% code overflow="wrap" %}

```sql
SELECT to_timestamp('2024-12-31 23:59:59', 'yyyy-MM-dd HH:mm:ss') AS campaign_end_timestamp;
```

{% endcode %}

## **`to_unix_timestamp` function**

{% code overflow="wrap" %}

```sql
SELECT to_unix_timestamp('2024-12-31 23:59:59', 'yyyy-MM-dd HH:mm:ss') AS unix_timestamp;
```

{% endcode %}

## **`to_utc_timestamp` function**

```sql
SELECT to_utc_timestamp(click_timestamp, 'America/Los_Angeles') AS utc_click_time
FROM customer_activity_data;
```

## **`year` function**

```sql
SELECT year(transaction_date) AS transaction_year, SUM(revenue) AS total_revenue
FROM customer_activity_data
GROUP BY transaction_year;
```

## **`date_sub` function**

```sql
SELECT customer_id, date_sub(event_date, 7) AS reminder_date
FROM customer_activity_data;
```

## **`date_trunc` function**

{% code overflow="wrap" %}

```sql
SELECT date_trunc('month', transaction_date) AS transaction_month, SUM(revenue) AS total_revenue
FROM customer_activity_data
GROUP BY transaction_month;
```

{% endcode %}

## **`dateadd` function**

```sql
SELECT customer_id, dateadd(MONTH, 1, subscription_start_date) AS next_billing_date
FROM customer_activity_data;
```

## **`datediff` function**

{% code overflow="wrap" %}

```sql
SELECT customer_id, datediff(current_date(), last_interaction_date) AS inactivity_days
FROM customer_activity_data;
```

{% endcode %}

## **`day` function**

```sql
SELECT day(transaction_date) AS transaction_day, COUNT(*) AS total_transactions
FROM customer_activity_data
GROUP BY transaction_day;
```

## **`dayofmonth` function**

{% code overflow="wrap" %}

```sql
SELECT dayofmonth(transaction_date) AS transaction_day_of_month, COUNT(*) AS total_transactions
FROM customer_activity_data
GROUP BY transaction_day_of_month;
```

{% endcode %}

## **`dayofweek` function**

```sql
SELECT dayofweek(click_timestamp) AS engagement_day, COUNT(*) AS total_engagements
FROM customer_activity_data
GROUP BY engagement_day;
```

## **`dayofyear` function**

{% code overflow="wrap" %}

```sql
SELECT dayofyear(transaction_date) AS transaction_day_of_year, COUNT(*) AS total_transactions
FROM customer_activity_data
GROUP BY transaction_day_of_year;
```

{% endcode %}
