How mobile operators analyze our data
Mobile operators receive a large volume of data and metadata, from which a great deal can be learned about the life of an individual subscriber. Once you understand how this data is processed and stored, you can trace the whole chain of information, from the call itself to the money being debited. Under an insider threat model the possibilities are greater still, because data protection is not among the tasks of a mobile operator's pre-billing systems.

First, keep in mind that subscriber traffic in the operator's network is generated by, and arrives from, different kinds of equipment. This equipment can produce files with records (CDR files, RADIUS logs, ASCII text) and work over different protocols (NetFlow, SNMP, SOAP). All of this unruly round dance has to be kept under control: the data must be collected, processed, and passed on to the billing system in a standardized format.

Throughout, subscriber data is moving everywhere, and outsiders should not have access to it. How secure is the information in such a system, given all the links in the chain? Let's figure it out.

Why do mobile operators need pre-billing?

Pre-billing is used to implement various kinds of data reconciliation and reloading, for example reconciling the status of services on the equipment with their status in billing. Sometimes a subscriber keeps using services even though he is already blocked in billing. Or he used the services, but the equipment produced no record of it. There are many such situations, and most of them are resolved with the help of pre-billing.

I once wrote a term paper on optimizing a company's business processes and calculating ROI. The problem with calculating ROI was not a lack of source data; I did not understand which "ruler" to measure it with. Much the same often happens with pre-billing.

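The reconciliation of service status between equipment and billing, mentioned above, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the subscriber identifiers, the status values, and the idea that both sides can be exported as simple key-value mappings. A real pre-billing system works with far richer records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Discrepancy:
    subscriber_id: str
    on_equipment: Optional[str]  # None means the subscriber is unknown to the equipment
    in_billing: Optional[str]    # None means the subscriber is unknown to billing


def reconcile(equipment: dict, billing: dict) -> list:
    """Return every subscriber whose status differs between the two sources."""
    result = []
    for sub_id in sorted(equipment.keys() | billing.keys()):
        eq_status = equipment.get(sub_id)
        bl_status = billing.get(sub_id)
        if eq_status != bl_status:
            result.append(Discrepancy(sub_id, eq_status, bl_status))
    return result


# Hypothetical snapshots: sub-002 still works on the equipment although
# billing has blocked it; sub-003 exists only in billing.
equipment_status = {"sub-001": "active", "sub-002": "active"}
billing_status = {"sub-001": "active", "sub-002": "blocked", "sub-003": "active"}

for item in reconcile(equipment_status, billing_status):
    print(item)
```

The sketch only reports discrepancies. Deciding which side is right, and what to reload, is exactly the part where it is hard to choose the "ruler".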
You can customize and improve the processing endlessly, but at some point circumstances and data will always combine in such a way that an exception occurs. You can build an ideal system for operating and monitoring the auxiliary billing and pre-billing systems, but you cannot guarantee the uninterrupted operation of equipment and data transmission channels.

That is why there is a duplicate system that checks the data in billing against the data that went from pre-billing to billing. Its task is to catch whatever left the equipment but for some reason "did not land on the subscriber." This duplicating and controlling role for pre-billing is usually played by the FMS (Fraud Management System). Of course, its main purpose is not to control pre-billing at all but to detect fraud schemes and, as a consequence, to monitor data losses and discrepancies between equipment data and billing data.

In fact, pre-billing has many uses. For example, it can reconcile the subscriber's status on the equipment with the status in the CRM. Such a scheme may look like this: pre-billing receives data from the equipment (HSS, VLR, HLR, AUC, EIR) over SOAP.

Getting data from the equipment.

Pre-billing with Hewlett-Packard Internet Usage Manager (HP IUM)

Imagine a large meat grinder into which meat, vegetables, and loaves of bread are thrown: anything at all. The products going in are varied, but coming out they all take the same shape. We can change the grate and get a different shape at the output, but the principle and method of processing remain the same: auger, knife, grate. This is the classic pre-billing scheme: data collection, processing, and output. In IUM pre-billing, the links in this chain are called encapsulator, aggregator, and datastore.

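The three-stage chain described above can be sketched in a few lines of Python. This is not HP IUM code or its API: the function names simply borrow the stage names, and the semicolon-separated input format and the comma-separated output format are assumptions made for the example.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def encapsulate(lines: Iterable[str]) -> Iterator[dict]:
    """Collection: turn raw 'subscriber;service;bytes' lines into uniform records."""
    for line in lines:
        subscriber, service, volume = line.strip().split(";")
        yield {"subscriber": subscriber, "service": service, "bytes": int(volume)}


def aggregate(records: Iterable[dict]) -> dict:
    """Processing: sum the traffic for each (subscriber, service) pair."""
    totals = defaultdict(int)
    for rec in records:
        totals[(rec["subscriber"], rec["service"])] += rec["bytes"]
    return dict(totals)


def store(totals: dict) -> list:
    """Output: render the aggregates in one fixed format for the billing system."""
    return [f"{sub},{svc},{total}" for (sub, svc), total in sorted(totals.items())]


# Hypothetical raw input, as if read from an equipment file.
raw = ["sub-001;data;1500", "sub-002;data;700", "sub-001;data;500"]
print(store(aggregate(encapsulate(raw))))
# prints ['sub-001,data,2000', 'sub-002,data,700']
```

Changing the "grate" corresponds to replacing `store` with a function that renders a different output format, while the first two stages stay the same.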
It is important to understand that the input must be complete: there is a certain minimum amount of information without which further processing is pointless. If a block or data element is missing, we get an error or a warning that processing is impossible, because the operations cannot be performed without this data. It is therefore very important that the equipment generates record files with a strictly defined, manufacturer-specified set and type of data.

Each type of equipment has its own processor (collector) that works only with its own input format. For example, you cannot simply drop a file from Cisco PGW-SGW equipment, carrying mobile subscribers' Internet traffic, onto a collector that processes the stream from Iskratel Si3000 fixed-line equipment. If we do, at best we get an exception during processing, and at worst all processing of that particular stream stops, because the collector's handler fails with an error and waits until we deal with the file that is "broken" from its point of view. As a rule, all pre-billing systems react badly to data that a specific collector was not configured to process.

The stream of parsed (RAW) data is first formed at the encapsulator level, where it can already be transformed and filtered. This is done when changes must be made to the stream before aggregation and must then apply to the entire data flow as it passes through the various aggregation schemes.

Files (.cdr, .log, and others) with records of subscriber activity are received from both local and remote sources (FTP, SFTP); other protocols are also possible. The parser processes the files using different Java classes.

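To show why a collector rejects a file in a foreign format, here is a minimal sketch of a collector with a strict input contract. The class, the field names, and the delimiter are all hypothetical; they are not the real Cisco or Iskratel record layouts, and real collectors in such systems are Java classes rather than Python.

```python
class RecordFormatError(ValueError):
    """Raised when a record does not match the collector's expected format."""


class Collector:
    def __init__(self, name, required_fields, delimiter=";"):
        self.name = name
        self.required_fields = tuple(required_fields)
        self.delimiter = delimiter

    def parse_line(self, line, line_no):
        values = line.rstrip("\n").split(self.delimiter)
        if len(values) != len(self.required_fields):
            raise RecordFormatError(
                f"{self.name}: line {line_no} has {len(values)} fields, "
                f"expected {len(self.required_fields)}"
            )
        record = dict(zip(self.required_fields, values))
        missing = [field for field, value in record.items() if not value.strip()]
        if missing:
            raise RecordFormatError(
                f"{self.name}: line {line_no} is missing values for {missing}"
            )
        return record

    def parse_file(self, lines):
        """Parse a whole file; one bad record rejects the file, nothing partial is returned."""
        return [self.parse_line(line, n) for n, line in enumerate(lines, start=1)]


# Hypothetical layouts: a mobile-data collector fed a fixed-line style file.
mobile = Collector("mobile-data", ["imsi", "start_time", "bytes_up", "bytes_down"])
fixed_line_file = ["5551234;2024-02-13T08:07:00;120"]

try:
    mobile.parse_file(fixed_line_file)
except RecordFormatError as exc:
    print("rejected:", exc)
```

Rejecting the whole file on the first bad record mirrors the behavior described above, where the handler stops and waits for the operator instead of silently skipping data.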
Since a pre-billing system in normal operation is not designed to keep a history of processed files (and there may be hundreds of thousands of them per day), the file is deleted from the source after processing. For various reasons the file is not always deleted correctly. As a result, records from a file are sometimes processed again, or with a long delay (once the file could finally be deleted). To prevent such duplicates there are protection mechanisms: checking for duplicate files or records, checking timestamps in records, and so on.

One of the most vulnerable points here is sensitivity to data volume. The more data we store (in memory, in databases), the more slowly we process new data and the more resources we consume, and eventually we still hit a limit beyond which we are forced to delete old data. For this reason auxiliary databases (MySQL, TimesTen, Oracle, and so on) are usually used to store this metadata. We thus get yet another system that affects pre-billing, with the security issues that follow from it.

How does pre-billing work?

Modern pre-billing is a set of modules, usually written in Java, that can be managed in a graphical interface using standard copy, paste, move, and drag operations. Working in this interface is simple and straightforward. It mostly runs on Linux or Unix operating systems, less often on Windows.

The main problems usually arise in testing and error detection, because the data passes through many rule chains and is enriched with data from other systems. It is not always convenient or easy to see what happens to the data at each stage, so we have to look for the cause by tracing changes to the relevant variables in the logs.

The weakness of this system is its complexity and the human factor. Any exception leads to data loss or to incorrectly generated data.

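Tracing how variables change as a record moves through a rule chain can be sketched as follows. This is an illustration only: the rules, the field names, the subscriber number, and the tariff lookup table are hypothetical, and a real system would enrich records from external systems rather than from a constant.

```python
import logging

logger = logging.getLogger("prebilling.trace")


def run_chain(record, rules):
    """Apply the rules in order and log which fields each rule changed."""
    current = dict(record)
    trace = []
    for rule in rules:
        updated = rule(dict(current))  # each rule works on its own copy
        # Map every changed field to (old, new); None stands for "field absent".
        changes = {
            key: (current.get(key), updated.get(key))
            for key in current.keys() | updated.keys()
            if current.get(key) != updated.get(key)
        }
        logger.info("rule=%s changes=%s", rule.__name__, changes)
        trace.append((rule.__name__, changes))
        current = updated
    return current, trace


def normalize_msisdn(rec):
    rec["msisdn"] = rec["msisdn"].lstrip("+")
    return rec


def add_tariff(rec):
    # Enrichment from another system, simulated with a constant lookup table.
    rec["tariff"] = {"70000000000": "basic"}.get(rec["msisdn"], "unknown")
    return rec


def to_kilobytes(rec):
    rec["kb"] = rec.pop("bytes") // 1024
    return rec


logging.basicConfig(level=logging.INFO, format="%(message)s")
final, trace = run_chain(
    {"msisdn": "+70000000000", "bytes": 4096},
    [normalize_msisdn, add_tariff, to_kilobytes],
)
print(final)
```

Logging only the difference produced by each rule, rather than the whole record, keeps the log readable when the chain is long. Note that such logs contain subscriber data themselves and need the same protection as the records.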
13-02-2024, 08:07