Next week San Francisco-based Feedzai is launching a new product that it claims can automate processes that have historically been the purview of data scientists, at least in the sector in which Feedzai specializes: fraud and risk management for financial and retail institutions.
The company promises that anyone with just basic computer skills can create a model, define a data sample, build an algorithm or even add machine learning.
The caveat: this person must have domain expertise in risk or fraud management.
After an in-depth introduction to the application I think it has promise. It's more than hype: its features and processes can indeed guide most IT civilians in creating an advanced computing model.
But I will know for certain in a few weeks when Feedzai sets me up with my own testing environment and dataset so I can develop a model to test a theory I have about online fraud. So stay tuned for that.
Meanwhile, though, let's take a look at what Feedzai will be making available next week.
I Want to Be a Data Scientist
The name of the product is Data Science Studio and it provides an array of tools for a customer to leverage big data for its own circumstances or growth plans.
There are tools and processes for the user to create a model, define a data sample, clean the data, build algorithms, create a machine learning model if necessary, test the model and then deploy it into production.
The user is guided through these steps with a series of tools like drag and drop, clicks, command codes and so on. Baby stuff for anyone who's used Outlook, for example.
The real work comes in when developing the model.
Say a retailer or financial service provider wants to expand into a new market, such as — to pick a country — South Africa, Feedzai CMO Loc Nguyen told CMSWire.
It wants to develop security and anti-fraud processes that are specific to South Africa and its intended customer base. These processes would reflect the way business or commerce is done in that country, which could well be very different from another African nation. It also wants its likely users' habits and patterns as part of the mix.
Nguyen said going the conventional route of having such a model built, testing it, putting it into production and then possibly adding a machine learning element to it to keep it up-to-date, could take six months at best.
How It Works
The same process would only take several weeks with Data Science Studio. Here's why, or rather how.
The user has a theory for the model: most people in South Africa use their mobile phones for payment, but only in certain parts of the country do they use them for very large payments (note to South African commerce experts: this example is made up). However, the company's client base travels a lot in South Africa, including to rural areas, because of the development work they do. So they are more likely to legitimately use their phones in these areas.
Or perhaps these clients tend to fly a lot. Then the model would reflect that when a payment is made at an airport in the US and another payment is made several hours later, the later payment is likely fraud because this person is probably in the air at that time. And so on.
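To make the airport theory concrete, here is a minimal sketch of how such a rule could be written down in Python. It is not Feedzai's code or its API: the `Payment` record, the `flag_in_flight_payments` function and the six-hour `FLIGHT_WINDOW` are all hypothetical names and values chosen for illustration, and treating a second purchase at the airport as harmless is an assumption of mine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Payment:
    """Hypothetical payment record; the fields are illustrative only."""
    customer_id: str
    timestamp: datetime
    at_airport: bool


# Assumed stand-in for "several hours later"; a real model would tune this.
FLIGHT_WINDOW = timedelta(hours=6)


def flag_in_flight_payments(payments):
    """Return non-airport payments made within FLIGHT_WINDOW after the
    same customer's most recent airport payment."""
    flagged = []
    last_airport_payment = {}  # customer_id -> time of latest airport payment
    for payment in sorted(payments, key=lambda p: p.timestamp):
        seen = last_airport_payment.get(payment.customer_id)
        if seen is not None and not payment.at_airport:
            elapsed = payment.timestamp - seen
            if timedelta(0) < elapsed <= FLIGHT_WINDOW:
                flagged.append(payment)
        if payment.at_airport:
            last_airport_payment[payment.customer_id] = payment.timestamp
    return flagged
```

A rule like this is deliberately crude. It says nothing about delayed flights or cards shared within a household, which is the kind of gap a domain expert would be expected to spot and refine.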
With these theories in mind, the risk manager sets about building the model. Drag and drop tools give him choices about how to incorporate, if at all, streaming data or what type of devices are most likely to be used for fraud.
The data comes from the client as well as from publicly available sources.
"You create your own model for how to do business in this country," Nguyen said.
Moving to Production
One differentiator is that Data Science Studio is able to use live data in the studio model and then shift right into production. "As the commerce is happening, the data is streaming into the model to see how it works," Nguyen said.
Often that is not possible as live transaction datasets are huge and modeling engines cannot handle the volume, he said. That is a big reason why developing a model can take so many months.
Feedzai's tech stack, which includes Hadoop and Apache Cassandra, can handle large volumes of data, Nguyen said.
A White Box
Another feature built for the non-tech user is the application's White Box, which can reverse engineer a piece of code or algorithm and spit it out in semantic form. Words, that is.
It is useful in case a legitimate customer is denied access to an account, for instance. The customer rep can tell the customer that his account was flagged because it was unusual for him to be spending so much money on that particular card in that city.
Also, it gives the user a better understanding of what is happening with his model, Nguyen said.
"Otherwise we would just be letting the machines take over." And no one likes that. Just ask the data scientists who will be displaced by Feedzai's new system.
Title image by Cali4beach.