I was the victim of card fraud in 2012. Criminals skimmed my debit card and emptied the bank account in a matter of hours. My bank fully refunded me, but when I checked the transactions, I realized that this was preventable.

picture of an airplane

Fraud Case

This is the original transaction list I received from my bank. The first transaction at 02:47 was an ATM withdrawal by me, and this wasn’t necessarily the machine where my card data was stolen. After that, the fraudulent transactions start, from Toronto, Canada and Cancun, Mexico.

GA-Verfügung - ATM withdrawal: the customer is at an ATM (German: Geldautomat) and has taken the money

EMV - Europay, Mastercard and Visa: chip-based authorization with PIN

EC-GA MAGNET0 - Authorization via the magnetic stripe of the card

Date (UTC) | Transaction text | Value
08.05.2012-02:47 | GA Verfügung EMV GEB.EU 5,99 EU 100,00 BERLIN Backshop GER | € -105,99
08.05.2012-14:30 | GA-Verfügung EC-GA MAGNET0 GEB.EU 7,50 CAD 300 TORONTO MCU WELLESLEY B CAN | € -239,41
08.05.2012-15:03 | GA-Verfügung EC-GA MAGNET0 GEB.EU 0,00 MXN 363,60 CANCUN TKUKULKAN K MEXGA MXN | € -21,60
08.05.2012-15:31 | GA-Verfügung EC-GA MAGNET0 GEB.EU 7,50 CAD 500 TORONTO MCU WELLESLEY B CAN | € -394,01
08.05.2012-15:03 | GA-Verfügung EC-GA MAGNET0 GEB.EU 0,00 MXN 657,60 CANCUN TKUKULKAN K MEXGA MXN | € -39,06
08.05.2012-15:16 | GA-Verfügung EC-GA MAGNET0 GEB.EU 0,00 MXN 7419,60 CANCUN TKUKULKAN K MEXGA MXN | € -440,68
08.05.2012-17:31 | GA-Verfügung EC-GA MAGNET0 GEB.EU 7,50 CAD 280 TORONTO MCU WELLESLEY B CAN | € -222,56

This continued until my bank account was empty.

Detection

Banks verify your account balance almost in real time on every withdrawal to prevent you from exceeding your credit limit. The information the bank receives from payment networks or vendor terminals has the same content as the transaction list above. During this check, the following fraud detection algorithm could be implemented to block fraudulent transactions before the money is withdrawn, lock the compromised card, or ask the user for additional verification via a mobile app.

How to detect

Calculate the distance between two given cities and make sure it’s realistic to travel the distance in the given time via airplane.


Implementation

Get the latitude and longitude of a given city and country, for example via the Python library geopy or the SimpleMaps database.

City | Latitude | Longitude
Berlin | 52.516667 | 13.4
Cancun | 21.172361 | -86.829546
Toronto | 43.666667 | -79.416667
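
To connect the coordinates to the bank's transaction text, a lookup step is needed. The following is only a minimal sketch, assuming the city name appears as a single upper-case word in the text, as it does in the list above. The names CITY_COORDS, lookup_city and to_pos are hypothetical helpers, and the three-entry dict stands in for a real gazetteer such as geopy or SimpleMaps.

```python
# Coordinates copied from the table above; a real system would query a full
# gazetteer instead of this tiny, hand-made dict (assumption for illustration).
CITY_COORDS = {
    "BERLIN": (52.516667, 13.4),
    "CANCUN": (21.172361, -86.829546),
    "TORONTO": (43.666667, -79.416667),
}


def lookup_city(transaction_text):
    """Return (city, (lat, lon)) for the first known city in the text, else None."""
    for token in transaction_text.upper().split():
        if token in CITY_COORDS:
            return token, CITY_COORDS[token]
    return None


def to_pos(coords):
    """Format a (lat, lon) tuple as a "lat:lon" string."""
    return f"{coords[0]}:{coords[1]}"


text = "GA-Verfügung EC-GA MAGNET0 GEB.EU 7,50 CAD 300 TORONTO MCU WELLESLEY B CAN"
city, coords = lookup_city(text)
print(city, to_pos(coords))
```

Matching whole words against a fixed list is fragile (multi-word city names, or the same city name in different countries, would need more care), so treat this as a starting point rather than a parser for real bank data.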

With the Haversine Formula, we can calculate the distance on a sphere with a radius of 6371 km (Earth).

a = sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)
c = 2 * atan2( √a, √(1−a) )
d = R * c

Here φ denotes latitude and λ longitude, both in radians; Δφ and Δλ are the differences between the two points, and R is the radius of the sphere. We then need the time difference between the two transactions to check whether an airplane (max 925 kph, about 0.26 kps) could actually cover the distance in that time.

#!/usr/bin/python3

import argparse
import datetime
import sys
from math import sin, cos, sqrt, atan2, radians

def calc_dist(pos1, pos2):
    """Great-circle distance in km between two "lat:lon" strings (haversine)."""
    R = 6371  # mean Earth radius in km
    # Parse "lat:lon" (decimal degrees) and convert to radians
    lat1, lon1 = (radians(float(v)) for v in pos1.split(":"))
    lat2, lon2 = (radians(float(v)) for v in pos2.split(":"))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c

def calc_time(time1, time2):
    tf = "%Y%m%dT%H%M%S"
    t1 = datetime.datetime.strptime(time1, tf)
    t2 = datetime.datetime.strptime(time2, tf)
    time_delta = t2 - t1
    return(time_delta.total_seconds())

def kph_to_kps(kph):
    return(float(kph)/3600)

def calc_realistic_dist(kps, time_delta):
    return(kps*time_delta)

def possible(distance, realistic_distance):
    if distance > realistic_distance:
        return("nope")
    else:
        return("yes")

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", help="max speed <int> kph; default=925", action="store", dest="speed", type=int, default=925)
    parser.add_argument("pos1", type=str, help="latitude and longitude position 1: <52.443099:13.622028>")
    parser.add_argument("pos2", type=str, help="latitude and longitude position 2: <52.522613:13.404837>")
    parser.add_argument("time1", type=str, help="ISO8601 timestamp 1: <20190101T172215>")
    parser.add_argument("time2", type=str, help="ISO8601 timestamp 2: <20190101T172255>")
    args = parser.parse_args()
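    # Parenthesized so that "not" applies to the whole condition;
    # argparse already enforces the positionals, this is only a safety net.
    if not (args.pos1 and args.pos2 and args.time1 and args.time2):
        parser.print_help()
        sys.exit(-1)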
    distance = calc_dist(args.pos1,args.pos2)
    time_delta = calc_time(args.time1,args.time2)
    kps = kph_to_kps(args.speed)
    realistic_distance = calc_realistic_dist(kps, time_delta)
    print("distance pos1 and pos2: " +  str(round(distance,2)) + " km")
    print("kilometer per second: " + str(round(kps,2)) + " kps")
    print("delta time1 and time2: " + str(time_delta) + " seconds")
    print("possible: " + str(possible(distance, realistic_distance)))

if __name__ == "__main__":
    main()

Let’s try my MVP with the first pair of transactions.

08.05.2012-02:47 Berlin 52.516667 13.4
08.05.2012-14:30 Toronto 43.666667 -79.416667
python ./algo-mvp.py 52.516667:13.4 43.666667:-79.416667 20120508T024700 20120508T143000
distance pos1 and pos2: 6476.68 km
kilometer per second: 0.26 kps
delta time1 and time2: 42180.0 seconds
possible: yes

So this is a possible transaction. Let’s check the next pair.

08.05.2012-14:30 Toronto 43.666667 -79.416667
08.05.2012-15:03 Cancun 21.172361 -86.829546
python ./algo-mvp.py 43.666667:-79.416667 21.172361:-86.829546 20120508T143000 20120508T150300
distance pos1 and pos2: 2593.52 km
kilometer per second: 0.26 kps
delta time1 and time2: 1980.0 seconds
possible: nope

At this point, the fraud detection would report an issue and the bank could decline the transaction. The damage to the bank would be limited to the first fraudulent transaction instead of many transactions worth thousands of euros.
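
The two manual runs above can also be expressed as a single pass over the transaction list. The following is a rough sketch, not the production logic: haversine_km and flag_impossible are hypothetical names, the transactions are hard-coded from the first three rows of my list, and each transaction is compared only with the one directly before it in time.

```python
from datetime import datetime
from math import atan2, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371
MAX_SPEED_KPH = 925  # same airplane speed limit as the script above


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) tuples in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * atan2(sqrt(h), sqrt(1 - h))


def flag_impossible(transactions, max_kph=MAX_SPEED_KPH):
    """Return the later transaction of each consecutive pair that needs more than max_kph."""
    ordered = sorted(transactions, key=lambda t: t[0])  # sort by timestamp
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur[0] - prev[0]).total_seconds() / 3600
        if haversine_km(prev[2], cur[2]) > max_kph * hours:
            flagged.append(cur)
    return flagged


# (timestamp in UTC, city, (lat, lon)) taken from the transaction list
transactions = [
    (datetime(2012, 5, 8, 2, 47), "Berlin", (52.516667, 13.4)),
    (datetime(2012, 5, 8, 14, 30), "Toronto", (43.666667, -79.416667)),
    (datetime(2012, 5, 8, 15, 3), "Cancun", (21.172361, -86.829546)),
]

for when, city, _ in flag_impossible(transactions):
    print(when.isoformat(), city)
```

With these three rows only the Cancun withdrawal is flagged. A real system would probably compare each new transaction against the last accepted one rather than the last seen one, so that a declined fraudulent withdrawal does not make the customer's next genuine transaction look impossible.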

Production

I would run the detection as a microservice. The decision to accept or decline a transaction has to be made quickly, so I would optimize the code for latency and use a Redis-based cache to hold the time and lat/long pair of the last transaction of every customer. A TTL of one day should be enough because in one day our airplane can reach any point on this planet: at 925 kph it covers 22,200 km in 24 hours, more than the roughly 20,000 km to the farthest point on Earth. Military personnel traveling by fighter jet will, unfortunately, trigger the fraud detection.
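
To illustrate the caching idea without needing a running Redis server, here is a small in-memory stand-in. It is a sketch under assumptions: LastTransactionCache and its methods are hypothetical names, and in a real deployment the put and get calls would map to Redis SET with an expiry and GET.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # one day, as suggested above


class LastTransactionCache:
    """In-memory stand-in for the Redis cache; entries expire after ttl seconds."""

    def __init__(self, ttl=TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._store = {}     # customer_id -> (expires_at, timestamp, lat, lon)

    def put(self, customer_id, timestamp, lat, lon):
        """Remember the customer's latest transaction time and position."""
        self._store[customer_id] = (self._clock() + self._ttl, timestamp, lat, lon)

    def get(self, customer_id):
        """Return (timestamp, lat, lon), or None if unknown or expired."""
        entry = self._store.get(customer_id)
        if entry is None:
            return None
        expires_at, timestamp, lat, lon = entry
        if self._clock() >= expires_at:
            del self._store[customer_id]
            return None
        return timestamp, lat, lon


cache = LastTransactionCache()
cache.put("customer-1", "20120508T024700", 52.516667, 13.4)
print(cache.get("customer-1"))
```

On each new withdrawal, the service would fetch the previous entry, run the distance check against it, and then overwrite the entry with the new transaction if it was accepted.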

Conclusion

Fraud detection requires many approaches and depends heavily on the use case. It’s important to look at fraud cases and analyze them. Nowadays, hopefully, all banks use machine learning to understand their customers’ behavior and flag potential fraud in real time. Batch processing once per day is no longer acceptable in my opinion. Customers deserve to be protected by modern real-time fraud detection technology.

My approach is simple and can help to prevent fraud in cases of unrealistic distances between transactions.