Identifying Accounts With Only Withdrawn Transactions Within a Specific Time Period Using SQL

Grouping Transactions by Account Type and Time Period

Understanding the Problem Statement

In this article, we will explore a common database query problem involving grouping transactions by account type and time period. We will break down the problem statement, analyze the requirements, and provide a step-by-step solution using SQL.

The problem revolves around a transaction table that contains information about deposits and withdrawals made by different accounts over various dates. The goal is to identify which accounts have only withdrawn money but have not deposited any money within a specific time duration.

Database Schema and Sample Data

For this example, let’s assume we have the following database schema:

CREATE TABLE transaction_table (
    acct_id VARCHAR(255) PRIMARY KEY,
    trans_type VARCHAR(255),
    date_of_tran DATE
);

Here is some sample data to illustrate the problem:

acct_idtrans_typedate_of_tran
AWITHDRAW2021-Jun-20
ADEPOSIT2021-Jun-21
ADEPOSIT2021-Jun-22
BWITHDRAW2021-Jun-20
BWITHDRAW2021-Jun-21
CDEPOSIT2021-Jun-20
CDEPOSIT2021-Jun-21
CWITHDRAW2021-Jun-21
CWITHDRAW2021-Jun-21

Breaking Down the Problem Statement

To solve this problem, we need to break it down into smaller components:

  1. Filter out accounts that have both withdrawn and deposited transactions.
  2. Identify which of these accounts only have withdrawn transactions within a specific time period.

Solution Overview

We will use SQL to solve this problem. The main concepts involved are:

  • Using subqueries with NOT EXISTS to filter rows based on the presence or absence of certain conditions.
  • Joining tables using the IN operator to compare values between two columns.

Step 1: Filter Out Accounts That Have Both Withdrawn and Deposited Transactions

We can start by filtering out accounts that have both withdrawn and deposited transactions. We will use a subquery with NOT EXISTS to achieve this:

SELECT * FROM transaction_table tt
WHERE trans_type = 'WITHDRAW'
AND NOT EXISTS (
    SELECT 1 
    FROM transaction_table tt1 
    WHERE tt1.acct_id = tt.acct_id 
    AND trans_type = 'DEPOSIT' 
    AND date_of_tran >= (SELECT MIN(date_of tran) FROM transaction_table tt2 WHERE tt2.acct_id = tt.acct_id)
);

However, this query may not be efficient as it scans the entire table twice. A better approach would be to use a GROUP BY clause with aggregate functions:

SELECT * FROM (
    SELECT acct_id, trans_type, date_of_tran,
           COUNT(CASE WHEN trans_type = 'DEPOSIT' THEN 1 END) OVER (PARTITION BY acct_id ORDER BY date_of_tran) AS deposit_count
    FROM transaction_table
) tt
WHERE trans_type = 'WITHDRAW'
AND deposit_count = 0;

Step 2: Identify Which Accounts Only Have Withdrawn Transactions Within a Specific Time Period

Now that we have filtered out accounts with both withdrawn and deposited transactions, we can focus on identifying which of these accounts only have withdrawn transactions within the desired time period.

We will use another subquery with NOT EXISTS to achieve this:

SELECT * FROM transaction_table tt
WHERE trans_type = 'WITHDRAW'
AND NOT EXISTS (
    SELECT 1 
    FROM transaction_table tt1 
    WHERE tt1.acct_id = tt.acct_id 
    AND date_of_tran >= (SELECT MIN(date_of tran) FROM transaction_table tt2 WHERE tt2.acct_id = tt.acct_id)
    AND date_of_tran <= (SELECT MAX(date_of tran) FROM transaction_table tt2 WHERE tt2.acct_id = tt.acct_id)
);

However, this query may still not be efficient as it scans the entire table multiple times. A better approach would be to use a single pass through the data:

WITH withdrawn_transactions AS (
    SELECT acct_id, trans_type, date_of_tran,
           COUNT(CASE WHEN trans_type = 'DEPOSIT' THEN 1 END) OVER (PARTITION BY acct_id ORDER BY date_of_tran) AS deposit_count
    FROM transaction_table
),
withdrawn_within_period AS (
    SELECT *
    FROM withdrawn_transactions
    WHERE trans_type = 'WITHDRAW'
    AND deposit_count = 0
)
SELECT * FROM withdrawn_within_period;

Conclusion

In this article, we explored a common database query problem involving grouping transactions by account type and time period. We broke down the problem statement into smaller components and provided step-by-step solutions using SQL.

We used aggregate functions with GROUP BY clauses to filter out accounts with both withdrawn and deposited transactions, and then used subqueries with NOT EXISTS to identify which of these accounts only have withdrawn transactions within a specific time period.


Last modified on 2024-07-14