The first analysis of any data set need not use sophisticated mathematics or statistics. The goal of these tests is to find subsets that are highly inflated due to error or fraud.
Largest Subsets Test
The largest subsets test uses two fields: one with transaction or balance numbers (such as amounts, inventory counts, vote counts, or population counts) and another to indicate the subset (e.g., vendor number, credit card number, or branch number). A subset is a group of records that have something in common.
The data can often be divided into subsets in several different ways. For example, accounts payable data could be grouped by vendor, by the type of purchase (purchase order or no purchase order), or by time. For inventory data the grouping could be by location. For airline ticket refunds or retail customer refunds, the grouping could be by the credit card that received the refund.
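The grouping described above can be illustrated with a minimal Python sketch. It is only one possible way to run the test, under stated assumptions: the records are held as dictionaries, the function name `largest_subsets`, the field names `vendor` and `amount`, and the sample payments are all hypothetical and do not come from any real data set. The sketch sums the numeric field for each subset and lists the subsets from largest to smallest, so that unusually large totals appear at the top for review.

```python
from collections import defaultdict


def largest_subsets(records, subset_field, amount_field, top_n=5):
    """Sum amount_field for each value of subset_field; return the largest totals.

    Each result is a tuple of (subset key, total amount, record count).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        key = record[subset_field]
        totals[key] += record[amount_field]
        counts[key] += 1
    # Rank subsets by total amount, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(key, total, counts[key]) for key, total in ranked[:top_n]]


# Hypothetical accounts payable records, grouped by vendor.
payments = [
    {"vendor": "V001", "amount": 1200.0},
    {"vendor": "V002", "amount": 300.0},
    {"vendor": "V003", "amount": 45000.0},
    {"vendor": "V001", "amount": 300.0},
    {"vendor": "V003", "amount": 500.0},
    {"vendor": "V002", "amount": 150.0},
]

for vendor, total, count in largest_subsets(payments, "vendor", "amount"):
    print(f"{vendor}  total={total:>10.2f}  records={count}")
```

The same function could be called with a different subset field (for example a location or a credit card number) to try another of the groupings mentioned above, provided the records carry that field.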